Analysis of new variable selection methods for discriminant analysis

Article ID	Journal	Published Year	Pages	File Type
416651	Computational Statistics & Data Analysis	2006	16 Pages	PDF

Abstract

Several methods for selecting the variables subsequently used in discriminant analysis are proposed and analysed. The aim is to find, from among a set of m variables, a smaller subset that enables an efficient classification of cases. Reducing dimensionality has several advantages, such as lower data acquisition costs, a better understanding of the final classification model, and an increase in the efficiency and efficacy of the model itself. The specific problem consists in finding, for a small integer value of p, the size-p subset of the original variables that yields the greatest percentage of hits in the discriminant analysis. To solve this problem, a series of techniques based on metaheuristic strategies is proposed. After performing some tests, it is found that they obtain significantly better results than the stepwise, backward or forward methods used by classic statistical packages. The way these methods work is illustrated with several examples.
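
To make the size-p subset problem concrete, the following is a minimal Python sketch, not the authors' method. It assumes a nearest-centroid rule as a simple stand-in for discriminant analysis, measures the hit rate on the same cases used to fit the centroids, and uses a plain swap-based local search rather than any of the metaheuristics named in the keywords. The function names hit_rate and swap_local_search and the toy data are hypothetical and chosen only for illustration.

```python
import random


def hit_rate(X, y, subset):
    """Fraction of cases assigned to their true class using only `subset`.

    Assumption: a nearest-centroid rule stands in for discriminant analysis,
    and the rate is computed on the same cases used to fit the centroids.
    """
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    hits = 0
    for x, label in zip(X, y):
        point = [x[j] for j in subset]
        # Assign the case to the class with the closest centroid.
        predicted = min(
            classes,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        hits += predicted == label
    return hits / len(y)


def swap_local_search(X, y, p, seed=0):
    """Improve a random size-p subset by single-variable swaps.

    Hypothetical baseline: stops at the first subset that no single swap
    improves, so it can return a local optimum.
    """
    rng = random.Random(seed)
    m = len(X[0])
    current = rng.sample(range(m), p)
    best = hit_rate(X, y, current)
    improved = True
    while improved:
        improved = False
        for out_var in list(current):
            for in_var in range(m):
                if in_var in current:
                    continue
                candidate = [v for v in current if v != out_var] + [in_var]
                score = hit_rate(X, y, candidate)
                if score > best:  # accept only strict improvements
                    current, best = candidate, score
                    improved = True
                    break
            if improved:
                break
    return sorted(current), best


# Toy data (hypothetical): m = 4 variables, two classes of four cases each.
# Only variable 0 separates the classes cleanly.
X = [
    [0, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1],
    [10, 0, 1, 0], [11, 1, 0, 0], [10, 0, 0, 1], [11, 1, 1, 2],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

subset, rate = swap_local_search(X, y, p=2)
print("selected variables:", subset, "hit rate:", rate)
```

A search of this kind stops as soon as no single swap improves the hit rate. Metaheuristic strategies are typically designed to escape such local optima, which is one plausible reason they could outperform simple greedy selection on harder data.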

Keywords

VNS; Metaheuristics; Variable selection; Discriminant analysis; Tabu search; GRASP; Memetics