Article ID | Journal | Published Year | Pages | File Type
2820843 | Genomics | 2013 | 11 Pages | PDF
Abstract

An important application of gene expression data is the classification of samples in a variety of diagnostic fields. However, high dimensionality combined with a small number of noisy samples poses significant challenges to existing classification methods. To address overfitting and sensitivity to noise in the classification of microarray data, we propose an interval-valued analysis method based on rough set theory that selects discriminative genes and uses them to classify tissue samples. We first select a small subset of genes using an interval-valued rough set that takes into account the preference-ordered domains of the gene expression data, and then assign test samples to classes according to a similarity degree. Experiments show that the proposed method reaches high prediction accuracy with a small number of selected genes and that its performance is robust to noise.
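The two-stage idea in the abstract (select a few discriminative genes from interval-valued data, then classify by a similarity degree) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not give the rough set construction, so the sketch assumes a simplified stand-in in which each gene is summarized per class by the interval mean ± one standard deviation, genes are ranked by how little their class intervals overlap, and similarity decays with distance from the interval. It also ignores the preference-ordered domains mentioned above. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def class_intervals(samples, labels):
    """Map each class to a list of per-gene (low, high) intervals.

    Assumption: an interval is mean +/- one population standard deviation.
    """
    intervals = {}
    for cls in sorted(set(labels)):
        rows = [s for s, lab in zip(samples, labels) if lab == cls]
        columns = list(zip(*rows))  # one tuple of values per gene
        intervals[cls] = [
            (mean(col) - pstdev(col), mean(col) + pstdev(col)) for col in columns
        ]
    return intervals


def overlap(a, b):
    """Intersection length divided by the total span; 0.0 when disjoint."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    span = max(a[1], b[1]) - min(a[0], b[0])
    if span == 0:
        return 1.0  # two identical point intervals overlap completely
    return max(inter, 0.0) / span


def select_genes(intervals, k):
    """Return the indices of the k genes whose class intervals overlap least."""
    classes = list(intervals)
    n_genes = len(intervals[classes[0]])

    def score(g):
        return sum(
            overlap(intervals[a][g], intervals[b][g])
            for i, a in enumerate(classes)
            for b in classes[i + 1:]
        )

    return sorted(range(n_genes), key=score)[:k]


def similarity(sample, cls_intervals, genes):
    """Average closeness to a class: 1.0 inside an interval, lower outside."""
    total = 0.0
    for g in genes:
        low, high = cls_intervals[g]
        dist = max(low - sample[g], sample[g] - high, 0.0)
        total += 1.0 / (1.0 + dist)
    return total / len(genes)


def classify(sample, intervals, genes):
    """Assign the class with the highest similarity degree."""
    return max(intervals, key=lambda c: similarity(sample, intervals[c], genes))


# Toy data: gene 0 separates the classes, genes 1 and 2 do not.
samples = [
    [9.0, 5.0, 3.0], [10.0, 4.0, 4.0], [11.0, 6.0, 3.5],  # tumor
    [2.0, 5.5, 3.2], [3.0, 4.5, 3.8], [1.0, 5.0, 3.4],    # normal
]
labels = ["tumor"] * 3 + ["normal"] * 3

intervals = class_intervals(samples, labels)
genes = select_genes(intervals, k=1)
print(genes, classify([9.5, 5.0, 3.5], intervals, genes))
```

Because each class is represented by an interval rather than by individual measurements, a single noisy sample shifts the interval bounds only modestly, which is one plausible reading of why an interval-valued representation might help with noise.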

► A rough set based method is proposed to analyze microarray data.
► The method achieves high prediction accuracy with a small set of selected genes.
► The method is robust to noise.
► The method's advantage is more pronounced when the sample size is reduced.

Related Topics
Life Sciences > Biochemistry, Genetics and Molecular Biology > Genetics