Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1180520 | 1491536 | 2015 | 12-page PDF | Free download |

• A new approach to test feature selection methods was proposed.
• To rank the investigated feature selection methods, we proposed a new criterion of selected-feature redundancy.
• The procedure to test and compare feature selection methods was proposed.
• ElasticNet, Lasso, LARS, Ridge, Stepwise and Genetic algorithms were compared according to quality measures.
This study investigates the multicollinearity problem and the performance of feature selection methods on data sets with multicollinear features. We propose a stress-test procedure for a set of feature selection methods. The procedure generates test data sets with various configurations of the target vector and features, and inserts a number of multicollinear features into every configuration. It allows a more comprehensive investigation of feature selection methods than the procedures described in previous papers. For a given test data set, a feature selection method returns a set of selected features. To compare the feature selection methods, the procedure uses several quality measures. We also propose a criterion of selected-feature redundancy, which estimates the number of multicollinear features among the selected ones. To detect multicollinearity, it uses the eigensystem of the parameter covariance matrix. In the computational experiments we consider the following illustrative methods: Lasso, ElasticNet, LARS, Ridge, Stepwise, and Genetic algorithms, and determine the best one, which solves the multicollinearity problem for every considered data set configuration.
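The abstract does not spell out the data generator or the redundancy criterion, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a linear regression model, for which the parameter covariance matrix is proportional to $(X^\top X)^{-1}$ and therefore shares its eigenvectors with $X^\top X$. It counts near-dependencies among the selected features using condition indices, with the conventional rule-of-thumb threshold of 30. The names `make_test_data` and `redundancy`, the noise levels, and the threshold are all assumptions made for illustration.

```python
import numpy as np


def make_test_data(n_samples=200, n_independent=4, n_collinear=2,
                   noise=1e-3, seed=0):
    """Hypothetical generator: independent features plus inserted collinear ones."""
    rng = np.random.default_rng(seed)
    x_ind = rng.standard_normal((n_samples, n_independent))
    # Each inserted feature is a random linear mix of the independent
    # features plus a small perturbation, so it is nearly redundant.
    mix = rng.standard_normal((n_independent, n_collinear))
    x_col = x_ind @ mix + noise * rng.standard_normal((n_samples, n_collinear))
    X = np.hstack([x_ind, x_col])
    # The target depends only on the independent features (an assumption).
    w = rng.standard_normal(n_independent)
    y = x_ind @ w + 0.1 * rng.standard_normal(n_samples)
    return X, y


def redundancy(X, selected, max_condition_index=30.0):
    """Count near-linear dependencies among the selected columns of X."""
    Xs = X[:, selected]
    # Scale columns to unit length so that eigenvalues are comparable.
    Xs = Xs / np.linalg.norm(Xs, axis=0)
    # Eigenvalues of X^T X; the parameter covariance of a linear model is
    # proportional to its inverse, so small eigenvalues here correspond to
    # large-variance directions in parameter space.
    eigvals = np.linalg.eigvalsh(Xs.T @ Xs)
    eigvals = np.maximum(eigvals, 1e-12)  # guard against round-off below zero
    condition_indices = np.sqrt(eigvals.max() / eigvals)
    return int(np.sum(condition_indices > max_condition_index))


if __name__ == "__main__":
    X, y = make_test_data()
    for selected in ([0, 1, 2, 3], [0, 1, 2, 3, 4], list(range(6))):
        print(selected, redundancy(X, selected))
```

By construction, a selection containing only the independent features should score zero, and each inserted collinear feature added to the selection should raise the count by one. A feature selection method that handles multicollinearity well would therefore return subsets with a low count on such test data.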
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 142, 15 March 2015, Pages 172–183