Article ID | Journal | Published Year | Pages |
---|---|---|---|
410363 | Neurocomputing | 2010 | 10 Pages |
Abstract
The t-statistic is widely used for gene ranking in the analysis of microarray gene expression data. Such a filter-based criterion is generally computed using all the training samples, not all of which may be equally important for the classification task. In this paper, we decompose the t-statistic into two parts, corresponding to relevant and irrelevant data points. The relevant data points are selected using support vectors and then used to compute the t-statistic for feature selection. By simultaneously selecting data points and genes, significantly better classification results are achieved on synthetic as well as on several benchmark cancer datasets.
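To make the idea concrete, here is a minimal Python sketch of ranking genes by a t-statistic computed on a subset of samples only. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact decomposition, so the unequal-variance (Welch) form of the two-sample t-statistic is assumed, and the names `welch_t`, `rank_genes`, and `support_idx` are hypothetical. The SVM step is not reproduced; `support_idx` stands in for the indices of the support vectors that a trained SVM would supply.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic, unequal-variance (Welch) form (an assumption)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(X, y, relevant=None):
    """Rank genes (columns of X) by |t| using only the rows listed in `relevant`.

    X: list of samples, each a list of expression values.
    y: list of class labels (0 or 1), one per sample.
    relevant: indices of the samples to keep; None means all samples.
    Each class needs at least two kept samples with non-zero variance.
    """
    rows = range(len(X)) if relevant is None else relevant
    scores = []
    for g in range(len(X[0])):
        pos = [X[i][g] for i in rows if y[i] == 1]
        neg = [X[i][g] for i in rows if y[i] == 0]
        scores.append(abs(welch_t(pos, neg)))
    # Gene indices sorted by decreasing |t|
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order, scores


# Toy data: 6 samples x 2 genes; gene 0 separates the classes, gene 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.1], [3.2, 5.2], [2.9, 2.9]]
y = [0, 0, 0, 1, 1, 1]

# Hypothetical stand-in for the support-vector indices of a trained SVM.
support_idx = [1, 2, 3, 4]

order, scores = rank_genes(X, y, relevant=support_idx)
order_all, scores_all = rank_genes(X, y)  # conventional ranking on all samples
```

Passing `relevant=None` gives the conventional all-sample ranking, so the two rankings can be compared directly on the same data.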
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Piyushkumar A. Mundra, Jagath C. Rajapakse