Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
530777 | 869787 | 2014 | 9-page PDF | Free download |
• A new framework for cost-based feature selection is proposed.
• Two representative filters are modified to perform cost-based feature selection.
• We test the framework over a heterogeneous set of 17 datasets.
• An SVM is chosen to evaluate the performance of the proposed approach.
• The cost is reduced without increasing the classification error.
Over the last few years, the dimensionality of datasets involved in data mining applications has increased dramatically. In this situation, feature selection becomes indispensable, as it allows for dimensionality reduction and relevance detection. The research presented in this paper broadens the scope of feature selection by taking into consideration not only the relevance of the features but also their associated costs. A new general framework is proposed, which adds a term to the evaluation function of a filter feature selection method so that the cost is taken into account. Although the proposed methodology could be applied to any feature selection filter, in this paper it is applied, by way of example, to two representative filter methods: Correlation-based Feature Selection (CFS) and Minimal-Redundancy-Maximal-Relevance (mRMR). The behavior of the proposed framework is tested on 17 heterogeneous classification datasets, employing a Support Vector Machine (SVM) as the classifier. The results of the experimental study show that the approach is sound and that it allows the user to reduce the cost without increasing the classification error.
Journal: Pattern Recognition - Volume 47, Issue 7, July 2014, Pages 2481–2489
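
The idea in the abstract of adding a cost term to a filter's evaluation function can be illustrated with a minimal sketch. The exact form of the term is not given here, so the code below assumes a simple one: a greedy mRMR-style score (relevance minus mean redundancy with the features already selected) from which `lam * cost[i]` is subtracted. The function name, the weight `lam`, and the toy relevance, redundancy, and cost values are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def cost_aware_greedy_selection(
    relevance: Sequence[float],
    redundancy: Sequence[Sequence[float]],
    cost: Sequence[float],
    n_select: int,
    lam: float,
) -> List[int]:
    """Greedy mRMR-style selection with an assumed linear cost penalty.

    At each step the candidate i maximizing
        relevance[i] - lam * cost[i] - mean(redundancy[i][j] for j in selected)
    is added. With lam = 0 the cost is ignored entirely.
    """
    selected: List[int] = []
    remaining = set(range(len(relevance)))

    def score(i: int) -> float:
        # Mean redundancy with already selected features (0 for the first pick).
        if selected:
            red = sum(redundancy[i][j] for j in selected) / len(selected)
        else:
            red = 0.0
        return relevance[i] - lam * cost[i] - red

    while remaining and len(selected) < n_select:
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): features 0 and 1 are both relevant and highly
# redundant with each other, but feature 0 is ten times more expensive.
relevance = [0.9, 0.85, 0.3]
redundancy = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
cost = [1.0, 0.1, 0.2]

no_cost = cost_aware_greedy_selection(relevance, redundancy, cost, 2, lam=0.0)
with_cost = cost_aware_greedy_selection(relevance, redundancy, cost, 2, lam=0.5)

print(no_cost, sum(cost[i] for i in no_cost))
print(with_cost, sum(cost[i] for i in with_cost))
```

On this toy data, ignoring the cost selects features 0 and 2, while `lam = 0.5` swaps the expensive feature 0 for its cheaper, nearly as relevant counterpart 1. In practice the weight would be tuned so that the cost drops without the classifier's error rising, which is the trade-off the abstract describes.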