An efficient feature selection algorithm for hybrid data

Article ID	Journal	Published Year	Pages	File Type
6865388	Neurocomputing	2016	14 Pages	PDF

Abstract

Feature selection for large-scale data sets is widely regarded as an important data preprocessing step in machine learning. Data sets in real databases often take a hybrid form, i.e., categorical and numerical attributes coexist. Based on the idea of decomposition and fusion, this paper studies an efficient feature selection approach for large-scale hybrid data sets. With this approach, an effective feature subset can be obtained in a much shorter time. Experiments on twelve UCI data sets, using two common classifiers as the evaluation function, show that the proposed approach is effective and efficient.
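
The abstract does not spell out the decomposition and fusion steps, so the following is only a rough illustrative sketch of the general idea, not the authors' algorithm. It assumes that decomposition means splitting the samples into blocks, that each block is reduced by a greedy rough-set-style dependency criterion (exact match on categorical features, a distance threshold `delta` on numerical ones), and that fusion is a vote over the per-block subsets. All names (`dependency`, `greedy_select`, `decompose_and_fuse`, `delta`, `min_votes`) and the toy data are hypothetical.

```python
from collections import Counter


def dependency(rows, labels, feats, is_numeric, delta=0.1):
    """Fraction of samples whose neighbourhood under `feats` is label-pure."""
    pure_count = 0
    for i, x in enumerate(rows):
        pure = True
        for j, y in enumerate(rows):
            if labels[i] == labels[j]:
                continue
            # y is a neighbour of x if it is close on every selected feature:
            # within delta for numerical features, equal for categorical ones.
            close = all(
                abs(x[f] - y[f]) <= delta if is_numeric[f] else x[f] == y[f]
                for f in feats
            )
            if close:  # a neighbour with a different label makes x impure
                pure = False
                break
        if pure:
            pure_count += 1
    return pure_count / len(rows)


def greedy_select(rows, labels, is_numeric, delta=0.1):
    """Forward selection: add the feature that raises dependency the most."""
    selected, best = [], 0.0
    remaining = list(range(len(is_numeric)))
    while remaining:
        scores = {
            f: dependency(rows, labels, selected + [f], is_numeric, delta)
            for f in remaining
        }
        f_best = max(remaining, key=lambda f: scores[f])
        if scores[f_best] <= best:  # stop when no feature improves dependency
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = scores[f_best]
    return selected


def decompose_and_fuse(rows, labels, is_numeric, n_blocks=2, min_votes=1, delta=0.1):
    """Assumed scheme: select features per sample block, then fuse by voting."""
    votes = Counter()
    for b in range(n_blocks):
        # Decomposition: every n_blocks-th sample goes into block b.
        block_rows, block_labels = rows[b::n_blocks], labels[b::n_blocks]
        votes.update(greedy_select(block_rows, block_labels, is_numeric, delta))
    # Fusion: keep features chosen in at least min_votes blocks.
    return sorted(f for f, v in votes.items() if v >= min_votes)


# Toy hybrid data: (colour: categorical, weight: numerical in [0, 1], tag: categorical)
rows = [
    ("red", 0.10, "a"), ("red", 0.50, "a"), ("blue", 0.12, "a"), ("blue", 0.52, "a"),
    ("red", 0.90, "a"), ("red", 0.30, "a"), ("blue", 0.88, "a"), ("blue", 0.33, "a"),
]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
is_numeric = [False, True, False]

selected = decompose_and_fuse(rows, labels, is_numeric)
print(selected)  # [0] for this toy data: only the colour feature separates the classes
```

In this sketch each block is much smaller than the full data set, so the quadratic neighbourhood comparison is cheaper per block; whether the paper obtains its speed-up in this way is an assumption.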

Keywords

Feature selection; Rough set theory; Hybrid data