1. Introduction
Diabetes is a chronic disease that affects people of all ages. Its causes, many of which are genetic or related to other illnesses, produce symptoms such as thirst, persistent fatigue, mobility issues and sweating. Mortality among patients with diabetes is associated with nephropathy and with long-term complications such as cardiovascular macroangiopathy, which arise from the prolonged harmful effects of hyperglycemia on tissues. In terms of pathophysiology, type 2 diabetes is a classic metabolic condition characterised by insulin resistance. Insulin resistance can lead to compensatory hyperinsulinemia, which exerts a proliferative influence on the cellular components of the vascular wall and increases the risk of cardiovascular disease (Kharroubi, 2015), (di Camillo et al., 2010). Treatments include insulin injections, oral medication and herbal remedies. The disease can lead to infection and to complications of the kidneys, eyes, brain and other organs. Characterising transcriptional modifications in the endothelium is a key step towards a better understanding of the mechanism of insulin action and of the relationship between insulin resistance and endothelial cell dysfunction [2]–(Statnikov, Aliferis, Tsamardinos, Hardin, & Levy, 2005). Microarrays are a key tool for profiling the global gene expression patterns of tissues and cells. At present, such data sets contain thousands of genes but few samples (Li, Weinberg, Darden, & Pedersen, 2001). An important challenge in recent biomedical research is whether sample data can be classified and assigned to specific diseases [6]–(Babu & Sarkar, 2017).
Developing a suitable classifier from training examples for genetic diagnosis is an open problem in this area. In this study, the challenge is to classify gene expression data into control and insulin-exposed categories (Vanitha, Devaraj, & Venkatesulu, 2015a). To this end, the k-nearest neighbours (KNN) approach, a non-parametric pattern recognition method, is applied. Because the data set consists of several thousand genes but only a few samples, many different subsets of genes that can discriminate between the sample classes may exist for a given data set. Many such subsets were found, and the significance of each gene for classifying the samples was assessed by examining how frequently the gene appears in these near-optimal sets [5], [10]–(Bouazza, Hamdi, Zeroual, & Auhmani, 2015). While KNN is simple and clinically attractive, many alternatives with differing performance have been reported by groups experienced in data analysis (Sheela & Rangarajan, 2018), (Vanitha, Devaraj, & Venkatesulu, 2015b). Dimensionality reduction of the variable space is a key pre-processing step for all classification and clustering methods. It is still unknown whether the increased transcription of specific genes involved in cellular proliferation in the endothelium is due to insulin itself. In this work, the classifier assigns each case to one of two classes: control or exposed.
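As an illustration of the KNN decision rule described above, the following is a minimal sketch in plain Python, not the implementation used in this study. It assumes Euclidean distance, k = 3 and a simple majority vote. The function names (`knn_predict`, `leave_one_out_accuracy`) and the toy expression values are hypothetical and chosen only for demonstration. Leave-one-out evaluation is included because it is a common choice when only a few samples are available; the study's own validation scheme is not stated here.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours."""
    # Rank training samples by their distance to the query and keep the k closest.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # With two classes and an odd k, the vote cannot end in a tie.
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(X, y, k=3):
    """Hold out each sample in turn, classify it from the rest, and return the hit rate."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical toy data: six samples, three selected genes each (not real measurements).
toy_X = [
    [1.0, 1.2, 0.9], [1.1, 0.8, 1.0], [0.9, 1.0, 1.1],   # control
    [3.0, 3.2, 2.9], [3.1, 2.8, 3.0], [2.9, 3.0, 3.1],   # exposed to insulin
]
toy_y = ["control"] * 3 + ["exposed"] * 3

if __name__ == "__main__":
    print(knn_predict(toy_X, toy_y, [1.0, 1.0, 1.0]))
    print(leave_one_out_accuracy(toy_X, toy_y))
```

In practice the input vectors would hold only the genes retained after dimensionality reduction, since distances computed over thousands of mostly uninformative genes tend to obscure the class structure.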