Rough clustering utilizing the principle of indifference

Article ID	Journal	Published Year	Pages	File Type
6858054	Information Sciences	2014	17 Pages	PDF

Abstract

Clustering is one of the most widely used methods in data mining, with applications in virtually any domain. Its main objective is to group similar objects into the same cluster while assigning dissimilar objects to different clusters. In particular, k-means clustering, a member of the partitioning clustering family, has gained great popularity. The classic (hard) k-means assigns each object unambiguously to one and only one cluster. To address uncertainty, soft clustering has been introduced using concepts such as fuzziness, possibility, or roughness. A decade ago, Lingras and West introduced a k-means approach based on the interval interpretation of rough set theory. In recent years, their rough k-means has gained increasing attention. In this paper, we propose a refined rough k-means algorithm that utilizes Laplace's principle of indifference to calculate the means. As we will discuss, this provides a sounder justification for the impact of the objects in the approximations than established rough k-means algorithms do. Furthermore, the weighting in the mean function is based on individual objects rather than on aggregated sub-means. In experiments, we compare the refined algorithm to related approaches.
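
The abstract does not give the mean formula itself, so the following is only a minimal sketch of one plausible reading of it. It assumes that an object in the lower approximation of a cluster counts with weight 1, and that an object lying in the boundaries of n clusters counts with weight 1/n in each of them (the principle of indifference applied per object). The function name `indifference_means`, its parameters, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def indifference_means(points, lower, boundary):
    """Compute one weighted mean per cluster (illustrative sketch only).

    points:   dict mapping object id -> tuple of coordinates
    lower:    dict mapping cluster id -> set of ids in its lower approximation
    boundary: dict mapping cluster id -> set of ids in its boundary region
    """
    # Assumption: count in how many cluster boundaries each object appears.
    boundary_count = defaultdict(int)
    for ids in boundary.values():
        for obj in ids:
            boundary_count[obj] += 1

    means = {}
    for cluster in lower:
        # Lower-approximation objects certainly belong here: weight 1.
        weighted = [(points[obj], 1.0) for obj in lower[cluster]]
        # Assumption: a boundary object shared by n clusters gets weight 1/n.
        weighted += [
            (points[obj], 1.0 / boundary_count[obj])
            for obj in boundary.get(cluster, ())
        ]
        if not weighted:
            continue  # empty cluster: no mean can be computed
        total = sum(w for _, w in weighted)
        dim = len(weighted[0][0])
        # Weights attach to individual objects, not to aggregated sub-means.
        means[cluster] = tuple(
            sum(p[d] * w for p, w in weighted) / total for d in range(dim)
        )
    return means


# Hypothetical data: "c" lies in the boundary of both clusters.
sample_points = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (1.0, 3.0)}
sample_lower = {0: {"a"}, 1: {"b"}}
sample_boundary = {0: {"c"}, 1: {"c"}}
sample_means = indifference_means(sample_points, sample_lower, sample_boundary)
```

In this sketch every object enters the sum with its own weight, which mirrors the abstract's statement that weighting is based on individual objects rather than on aggregated sub-means. The weights used in the paper itself may differ.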

Keywords

Overlapping clusters