Article ID: 534861
Journal: Pattern Recognition Letters
Published Year: 2011
Pages: 13
File Type: PDF
Abstract

This paper investigates feature selection based on rough sets for dimensionality reduction in Case-Based Reasoning classifiers. To be useful, Case-Based Reasoning systems should be able to manage imprecise, uncertain and redundant data so that they can retrieve the most relevant information from a potentially overwhelming quantity of data. Rough Set Theory has been shown to be an effective tool for data mining and for uncertainty management. This paper makes two central contributions: (1) it develops three strategies for feature selection, and (2) it proposes several measures for estimating attribute relevance based on Rough Set Theory. Although we concentrate on Case-Based Reasoning classifiers, the proposals are general enough to be applicable to a wide range of learning algorithms. We applied these proposals to twenty data sets from the UCI repository and examined the impact of feature selection on classification performance. Our evaluation shows that all three proposals benefit the basic Case-Based Reasoning system. They also prove robust in comparison with well-known feature selection strategies.

Research Highlights
► Feature selection can improve prediction accuracy in most classifiers, including CBR.
► Reducts from Rough Set Theory are useful for extracting patterns of features.
► Reducts can be used to derive different feature relevance measures.
► This paper applies various filter feature selection strategies based on these measures.
► These strategies are fast and offer a good tradeoff between accuracy and feature reduction.
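
The abstract does not spell out the paper's own relevance measures or selection strategies, so the following is only a minimal sketch of the standard rough-set notions it builds on: the dependency degree of a decision attribute on a set of condition attributes, and reducts as minimal attribute subsets that preserve that degree. The function names, the toy decision table, and the frequency-in-reducts score are all assumptions made for illustration; they are not taken from the paper, and the brute-force search is only practical for a handful of attributes.

```python
from itertools import combinations

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, []).append(i)
    return list(blocks.values())

def dependency(rows, attrs, decision):
    """Dependency degree: fraction of rows lying in the positive region."""
    positive = 0
    for block in partition(rows, attrs):
        labels = {rows[i][decision] for i in block}
        if len(labels) == 1:  # block is consistent with a single decision
            positive += len(block)
    return positive / len(rows)

def reducts(rows, conditions, decision):
    """Brute-force search for minimal subsets that keep the full dependency."""
    full = dependency(rows, conditions, decision)
    found = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            # Skip supersets of an already found reduct: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if dependency(rows, subset, decision) == full:
                found.append(subset)
    return found

def relevance(conditions, found):
    """Hypothetical score: share of reducts that contain each attribute."""
    return {a: sum(a in r for r in found) / len(found) for a in conditions}

# Toy decision table (invented): d = a XOR b, and c duplicates b.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 0},
]
conditions = ["a", "b", "c"]
found = reducts(rows, conditions, "d")
scores = relevance(conditions, found)
print(found)   # [('a', 'b'), ('a', 'c')]
print(scores)  # {'a': 1.0, 'b': 0.5, 'c': 0.5}
```

In this toy table, attribute a appears in every reduct, while b and c are interchangeable, which is the kind of redundancy a filter strategy could use to rank or discard features before case retrieval.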

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition