Article ID: 416651
Journal: Computational Statistics & Data Analysis
Published Year: 2006
Pages: 16
File Type: PDF
Abstract

Several methods for selecting the variables subsequently used in discriminant analysis are proposed and analysed. The aim is to find, from among a set of m variables, a smaller subset that enables efficient classification of cases. Reducing dimensionality has several advantages: lower data acquisition costs, a final classification model that is easier to understand, and an increase in the efficiency and efficacy of the model itself. The specific problem consists in finding, for a small integer value of p, the size-p subset of the original variables that yields the greatest percentage of hits (correctly classified cases) in the discriminant analysis. To solve this problem, a series of techniques based on metaheuristic strategies is proposed. Tests show that these techniques obtain significantly better results than the stepwise, backward, or forward selection methods used by classic statistical packages. The way these methods work is illustrated with several examples.
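
The problem statement above can be made concrete with a small sketch. The abstract does not say which metaheuristics the authors use, so the code below is only an illustration under stated assumptions: a nearest-centroid rule stands in for discriminant analysis, a plain swap-based local search stands in for a metaheuristic, and the function names (`hit_rate`, `local_search`, `make_toy_data`) and the toy data are hypothetical.

```python
import random


def hit_rate(X, y, subset):
    """Fraction of cases classified correctly using only the variables in `subset`.

    Assumption: a nearest-centroid rule is used as a simple stand-in for
    discriminant analysis; it is not the classifier from the paper.
    """
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    hits = 0
    for x, label in zip(X, y):
        point = [x[j] for j in subset]
        predicted = min(
            classes,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        hits += predicted == label
    return hits / len(X)


def local_search(X, y, p, seed=0):
    """Swap-based local search over size-p subsets of the m variables.

    Assumption: this generic neighbourhood search only illustrates the
    subset-selection problem; it is not one of the authors' methods.
    """
    rng = random.Random(seed)
    m = len(X[0])
    current = rng.sample(range(m), p)
    best = hit_rate(X, y, current)
    improved = True
    while improved:
        improved = False
        for i in range(p):
            for j in range(m):
                if j in current:
                    continue
                # Replace the variable in position i with variable j.
                candidate = current[:i] + [j] + current[i + 1:]
                score = hit_rate(X, y, candidate)
                if score > best:
                    current, best, improved = candidate, score, True
    return sorted(current), best


def make_toy_data(n_per_class=40, m=6, seed=1):
    """Hypothetical two-class data: only variables 0 and 3 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(m)]
            row[0] += 4 * label
            row[3] += 4 * label
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    subset, score = local_search(X, y, p=2)
    print("selected variables:", subset, "hit rate:", score)
```

The search stops at a subset that no single swap can improve, which is a local optimum and not necessarily the best size-p subset. Escaping such local optima is the usual motivation for metaheuristic strategies.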

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics