Article ID: 529933
Journal: Pattern Recognition
Published Year: 2015
Pages: 15
File Type: PDF
Abstract

• Compare common feature selection filter measures for use with specific classifiers.
• Many tested filter measures do not reliably predict classifier accuracy.
• Some measures have specific problems that cause them to select unsuitable features.
• Best feature selection filter measure is classifier specific.

Feature selection is an important part of classifier design. There are many possible methods for searching and evaluating feature subsets, but little consensus on which methods are best. This paper examines a number of filter-based feature subset evaluation measures with the goal of assessing their performance with respect to specific classifiers.

This work tests 16 common filter measures for use with K-nearest neighbors and support vector machine classifiers. The measures are tested on 20 real and 20 artificial data sets, which are designed to probe for specific challenges. The strengths and weaknesses of each measure are discussed with respect to the specific challenges and correlation with classifier accuracy. The results highlight several challenging problems with a number of common filter measures.

The results indicate that the best filter measure is classifier-specific. K-nearest neighbors classifiers work well with subset-based RELIEF, correlation feature selection or conditional mutual information maximization, whereas Fisher's interclass separability criterion and conditional mutual information maximization work better for support vector machines. Despite the large number and variety of feature selection measures proposed in the literature, no single measure is guaranteed to outperform the others, even within a single classifier, and the overall performance of a feature selection method cannot be characterized independently of the subsequent classifier.
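As a rough illustration of the filter-then-classify workflow the paper evaluates, the sketch below ranks features with a simple univariate mutual-information filter and measures cross-validated accuracy of K-nearest neighbors and SVM classifiers on the top-ranked subsets. The synthetic data set, the use of scikit-learn's mutual_info_classif as the filter measure, and the chosen subset sizes are illustrative assumptions, not the paper's 16 measures or its experimental protocol.

```python
# Minimal sketch (assumptions noted above): score features with a univariate
# mutual-information filter, then check how the selected subsets perform with
# two different classifiers, since the paper reports the best filter measure
# is classifier-specific.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data with informative, redundant, and irrelevant features,
# loosely in the spirit of artificial data sets built to probe filter behavior.
X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           n_redundant=6, random_state=0)

# Filter step: score each feature independently and rank from best to worst.
scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]

# Classifier step: evaluate the same ranked subsets with KNN and SVM.
for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf"))]:
    for k in (5, 10, 30):
        subset = ranking[:k]
        model = make_pipeline(StandardScaler(), clf)
        acc = cross_val_score(model, X[:, subset], y, cv=5).mean()
        print(f"{name} with top {k} features: mean CV accuracy {acc:.3f}")
```

Comparing the accuracy curves across the two classifiers gives a small-scale sense of the paper's point: a ranking produced by one filter measure may track accuracy well for one classifier and poorly for another.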

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition
Authors