Article ID Journal Published Year Pages File Type
398507 International Journal of Approximate Reasoning 2008 12 Pages PDF
Abstract

In data mining, searching for simple representations of knowledge is a very important issue. Attribute reduction, continuous attribute discretization and symbolic value partition are three preprocessing techniques which are used in this regard. This paper investigates the symbolic value partition technique, which divides each attribute domain of a data table into a family for disjoint subsets, and constructs a new data table with fewer attributes and smaller attribute domains. Specifically, we investigates the optimal symbolic value partition (OSVP) problem of supervised data, where the optimal metric is defined by the cardinality sum of new attribute domains. We propose the concept of partition reducts for this problem. An optimal partition reduct is the solution to the OSVP-problem. We develop a greedy algorithm to search for a suboptimal partition reduct, and analyze major properties of the proposed algorithm. Empirical studies on various datasets from the UCI library show that our algorithm effectively reduces the size of attribute domains. Furthermore, it assists in computing smaller rule sets with better coverage compared with the attribute reduction approach.

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence