Article ID Journal Published Year Pages File Type
402670 Knowledge-Based Systems 2014 12 Pages PDF
Abstract

Several clustering analysis techniques for categorical data exist to divide similar objects into groups. Some are able to handle uncertainty in the clustering process, whereas others have stability issues. In this paper, we propose a new technique called TMDP (Total Mean Distribution Precision) for selecting the partitioning attribute based on probabilistic rough set theory. On the basis of this technique, with the concept of granularity, we derive a new clustering algorithm, MTMDP (Maximum Total Mean Distribution Precision), for categorical data. The MTMDP algorithm is a robust clustering algorithm that handles uncertainty in the process of clustering categorical data. We compare the MTMDP algorithm with the MMR (Min–Min–Roughness) algorithm which is the most relevant clustering algorithm, and also compared it with other unstable clustering algorithms, such as k-modes, fuzzy k-modes and fuzzy centroids. The experimental results indicate that the MTMDP algorithm can be successfully used to analyze grouped categorical data because it produces better clustering results.

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence
Authors
, , , , ,