Article ID Journal Published Year Pages File Type
4946305 Knowledge-Based Systems 2017 34 Pages PDF
Abstract
Feature selection problem in data mining is addressed here by proposing a bi-objective genetic algorithm based feature selection method. Boundary region analysis of rough set theory and multivariate mutual information of information theory are used as two objective functions in the proposed work, to select only precise and informative data from the data set. Data set is sampled with replacement strategy and the method is applied to determine non-dominated feature subsets from each sampled data set. Finally, ensemble of such bi-objective genetic algorithm based feature selectors is developed with the help of parallel implementations to produce much generalized feature subset. In fact, individual feature selector outputs are aggregated using a novel dominance based principle to produce final feature subset. Proposed work is validated using repository especially for feature selection datasets as well as on UCI machine learning repository datasets and the experimental results are compared with related state of art feature selection methods to show effectiveness of the proposed ensemble feature selection method.
Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , ,