Article ID: 535280
Journal: Pattern Recognition Letters
Published Year: 2015
Pages: 7
File Type: PDF
Abstract

• The paper investigates whether rough clustering is really needed in data mining.
• We show that rough clustering provides more detailed results than hard approaches.
• Rough clustering minimizes the number of incorrectly clustered objects.
• Trade-offs between correctly clustered objects, incorrectly clustered objects, and "do not knows" are configurable.
• We show that rough clustering suits the risk aversion that is typical of most humans.

Clustering plays an important role in data mining. Some of the most famous clustering methods belong to the family of k-means algorithms. A decade ago, Lingras and West enriched the field of soft clustering by introducing rough k-means. Although rough clustering has been a very active field of research, a focused evaluation of whether it is really needed is still missing. Thus, the objective of the paper is to compare rough k-means and k-means. In k-means, the number of correctly clustered objects is to be maximized, which is equivalent to minimizing the number of incorrectly clustered objects. In rough clustering, by contrast, the numbers of correctly and incorrectly clustered objects are no longer complements, because objects with unclear membership can be left undecided. Hence, in rough clustering the number of incorrectly clustered objects can be minimized explicitly. This is highly relevant for many real-life applications where minimizing the number of incorrectly clustered objects is more important than maximizing the number of correctly clustered objects. Therefore, we argue that rough k-means is often a strong alternative to k-means.
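To make the contrast with hard k-means concrete, the following is a minimal sketch of a rough assignment step, assuming one common formulation in which an object is left undecided when a second centroid is almost as close as the nearest one. The function name `rough_assign`, the ratio threshold `zeta`, and the toy data are illustrative assumptions, not taken from the paper.

```python
import math

def rough_assign(points, centroids, zeta=1.3):
    """Split objects into lower approximations and boundaries.

    zeta is a hypothetical ratio threshold (an assumption of this sketch):
    if another centroid lies within zeta times the distance to the nearest
    centroid, the object's membership is treated as undecided.
    """
    k = len(centroids)
    lower = {c: [] for c in range(k)}      # objects that surely belong to c
    boundary = {c: [] for c in range(k)}   # objects that possibly belong to c
    for i, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=dists.__getitem__)
        # Other clusters that are almost as close as the nearest one.
        close = [c for c, d in enumerate(dists)
                 if c != nearest and d <= zeta * dists[nearest]]
        if close:
            for c in [nearest, *close]:
                boundary[c].append(i)
        else:
            lower[nearest].append(i)
    return lower, boundary

# Toy data: two centroids on a line, two clear objects and two ambiguous ones.
centroids = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (9.0, 0.0), (5.0, 0.0), (4.6, 0.0)]
lower, boundary = rough_assign(points, centroids)
print(lower)     # objects 0 and 1 are assigned with certainty
print(boundary)  # objects 2 and 3 are shared by both boundaries
```

Setting `zeta` close to 1 shrinks the boundaries toward a hard k-means assignment, while larger values leave more objects undecided; this is one way the trade-off mentioned in the highlights could be configured.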

Graphical abstract

Rough clustering establishes buffer zones (boundaries) between clusters for objects with unclear memberships to reduce the number of incorrectly clustered objects. The boundaries allow a more detailed analysis of the clustering results than hard algorithms (e.g., k-means) do, in the sense of three-way decisions: (1) an object is a member of a certain cluster, (2) an object is not a member of a certain cluster, or (3) the membership cannot be decided. This also suits the risk aversion that is typical of most humans.
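The three-way view can be illustrated by counting outcomes against known labels. This is a sketch under stated assumptions: each object carries a set of candidate clusters (a single candidate means a decided membership), cluster labels are assumed to be already matched to the true classes, and the helper `three_way_counts` and the toy labels are hypothetical.

```python
def three_way_counts(assigned, truth):
    """Count correctly clustered, incorrectly clustered, and undecided objects.

    assigned: one set of candidate cluster labels per object
              (hypothetical representation; one element = decided).
    truth:    one true label per object (hypothetical ground truth).
    """
    correct = incorrect = undecided = 0
    for candidates, label in zip(assigned, truth):
        if len(candidates) != 1:
            undecided += 1          # boundary object: "do not know"
        elif label in candidates:
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect, undecided

truth = [0, 0, 1, 1, 1]
hard = [{0}, {0}, {1}, {1}, {0}]       # hard clustering forces a decision
rough = [{0}, {0}, {1}, {1}, {0, 1}]   # rough clustering defers the last object

print(three_way_counts(hard, truth))   # (4, 1, 0)
print(three_way_counts(rough, truth))  # (4, 0, 1)
```

In the hard case, correct and incorrect counts always sum to the number of objects. In the rough case, the ambiguous object moves to the undecided count, so the number of incorrectly clustered objects drops without changing the number of correctly clustered ones.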

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition