Article ID: 6861635
Journal: Knowledge-Based Systems
Published Year: 2018
Pages: 29 pages
File Type: PDF
Abstract
With the rapid growth of high-dimensional data sets in recent years, the need to reduce the dimensionality of data has grown significantly. Although wrapper approaches tend to achieve higher accuracy rates than filter techniques for the same number of selected features, only a few wrapper algorithms are applicable to high-dimensional data sets because their computational time becomes excessive. We thus propose a new hybrid feature selection algorithm that is computationally efficient and achieves high accuracy rates on high-dimensional data. The proposed method employs interaction information to guide the search, sequentially adds one feature at a time to the currently selected subset, and adopts early stopping to prevent overfitting and speed up the search. Our method is dynamic and selects only relevant, non-redundant features that significantly improve the accuracy rates. Our experimental results on eleven high-dimensional data sets demonstrate that our algorithm consistently outperforms prior feature selection techniques while requiring a reasonable amount of search time.
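The abstract names three ingredients: an interaction-information guide, sequential forward addition, and early stopping. The sketch below shows one way they could fit together; it is not the authors' algorithm, whose details the abstract does not give. The scoring rule (relevance plus mean interaction with the selected subset), the lookup-table classifier used as the wrapper, and the `min_gain` and `patience` parameters are all assumptions made for illustration, and the features are assumed to be discrete.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(*cols):
    """Joint Shannon entropy in bits of one or more discrete columns."""
    n = len(cols[0])
    return -sum(k / n * log2(k / n) for k in Counter(zip(*cols)).values())


def mutual_info(x, y):
    """I(x; y) = H(x) + H(y) - H(x, y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def interaction_info(f, s, y):
    """I(f; s; y) = I(f, s; y) - I(f; y) - I(s; y).
    Positive: f and s are synergistic about y. Negative: they are redundant."""
    return mutual_info(list(zip(f, s)), y) - mutual_info(f, y) - mutual_info(s, y)


def loo_accuracy(cols, y):
    """Wrapper score (assumed stand-in classifier): leave-one-out accuracy of a
    majority vote among the other samples with identical feature values.
    Ties count as errors."""
    rows = list(zip(*cols))
    table = Counter(zip(rows, y))  # counts of (feature pattern, label)
    labels = set(y)
    hits = 0
    for row, label in zip(rows, y):
        # Remove the held-out sample's own vote before comparing.
        votes = {l: table[(row, l)] - (l == label) for l in labels}
        hits += all(votes[label] > v for l, v in votes.items() if l != label)
    return hits / len(y)


def select_features(X, y, min_gain=0.01, patience=3):
    """Greedy forward selection. X maps feature name -> discrete column.
    min_gain and patience are hypothetical knobs, not taken from the paper."""
    selected, best_acc, remaining = [], 0.0, list(X)

    def score(f):
        # Filter step: relevance plus mean interaction with the chosen subset.
        rel = mutual_info(X[f], y)
        inter = sum(interaction_info(X[f], X[s], y) for s in selected)
        # Rounding keeps float noise from deciding ties.
        return round(rel + (inter / len(selected) if selected else 0.0), 9)

    while remaining:
        ranked = sorted(remaining, key=score, reverse=True)
        for f in ranked[:patience]:  # wrapper step on the top candidates only
            acc = loo_accuracy([X[s] for s in selected + [f]], y)
            if acc >= best_acc + min_gain:
                selected.append(f)
                remaining.remove(f)
                best_acc = acc
                break
        else:
            break  # early stopping: no top candidate improved accuracy enough
    return selected, best_acc


# Toy data: y = a AND b, c is irrelevant noise, d is a redundant copy of a.
rows = list(product([0, 1], repeat=3)) * 2
a, b, c = (list(col) for col in zip(*rows))
y = [ai & bi for ai, bi in zip(a, b)]
X = {"a": a, "b": b, "c": c, "d": list(a)}
selected, acc = select_features(X, y)
print(selected, acc)
```

On this toy data the redundant copy `d` receives a negative interaction term once `a` is selected, so it ranks below `b`, and the search stops as soon as no remaining candidate raises the wrapper accuracy. Evaluating only the top-ranked candidates with the wrapper is what keeps the cost below that of a pure wrapper search in this sketch.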
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence