Article ID: 410363
Journal: Neurocomputing
Published Year: 2010
Pages: 10
File Type: PDF
Abstract

The t-statistic is widely used for gene ranking in the analysis of microarray gene expression data. Such a filter-based criterion is generally computed using all the training samples, although not all of them may be equally important for the classification task. In this paper, we decompose the t-statistic into two parts, corresponding to relevant and irrelevant data points. The relevant data points are selected using support vectors and then used to compute the t-statistic for feature selection. By simultaneously selecting data points and genes, significantly better classification results are achieved on synthetic data as well as on several benchmark cancer datasets.
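
The idea of ranking genes by a t-statistic computed only on selected samples can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact formula, so the Welch (unequal-variance) form is assumed here, and the names `welch_t`, `rank_genes`, and `keep` are hypothetical. The indices of the relevant samples are taken as an input, assumed to be the support vectors of an SVM trained elsewhere.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(pos, neg):
    """Absolute Welch t-statistic between two lists of expression values.

    Assumption: the unequal-variance form is used; each list needs at
    least two values so that the sample variance is defined.
    """
    se = sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return abs(mean(pos) - mean(neg)) / se if se > 0 else 0.0


def rank_genes(X, y, keep=None):
    """Rank genes (columns of X) by t-statistic over the samples in `keep`.

    X    : list of samples, each a list of gene expression values.
    y    : class labels, +1 or -1, one per sample.
    keep : list of indices of the relevant samples (hypothetically the
           support vectors of a previously trained SVM); None uses all.
    Returns (gene indices sorted by decreasing score, per-gene scores).
    """
    rows = list(range(len(X))) if keep is None else list(keep)
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        # Split the retained samples by class for this gene.
        pos = [X[i][g] for i in rows if y[i] == 1]
        neg = [X[i][g] for i in rows if y[i] == -1]
        scores.append(welch_t(pos, neg))
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order, scores
```

Calling `rank_genes(X, y)` gives the conventional ranking over all training samples, while `rank_genes(X, y, keep=sv_idx)` restricts the statistic to the selected points. With scikit-learn, for instance, `sv_idx` could come from the `support_` attribute of a fitted `SVC`; that choice is an assumption of this sketch, not something the abstract states.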

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence