Article ID: 563711
Journal: Signal Processing
Published Year: 2014
Pages: 10
File Type: PDF
Abstract

• A clustering method for estimating the number of clusters is proposed.
• The Average Central Error (ACE) is the difference between the true cluster centers and the estimated centers.
• We probabilistically define the unavailable ACE based on cluster compactness.
• Minimizing the ACE leads to an estimate of the correct number of clusters.

In this paper, we tackle the problem of estimating the correct number of clusters from a new perspective. The proposed method probabilistically estimates the Average Central Error (ACE), which is the difference between the true cluster centers and their estimates. Because the true centers are unknown, the ACE cannot be computed directly; the novelty of this work lies partly in estimating it from the available cluster compactness, which is the difference between the estimated centers and their members. The approach is explored with K-means clustering. The resulting method, denoted Minimum ACE K-means (MACE-means), is shown to have unique advantages on both synthetic and real data. MACE-means clustering is applied to benchmark real-world data sets from the UCI Machine Learning Repository and to synthesized clusters that represent a wide class of clustering scenarios. Our analysis confirms the superiority of MACE-means over state-of-the-art clustering methods in robustness to initialization error, accuracy in detecting the correct number of clusters, lower time complexity, and robustness to cluster overlap.
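The abstract does not give the probabilistic estimator itself, so the sketch below is not MACE-means. It only illustrates the two quantities the abstract contrasts, on synthetic data where the true centers happen to be known: the compactness (always available) and the ACE (normally unavailable). Two assumptions are made here and are not taken from the paper: both quantities are measured as mean squared Euclidean distances, and the ACE is evaluated per point, between the center that generated the point and the estimated center of the cluster it was assigned to. All function names (`kmeans`, `compactness`, `ace`, `sweep`, `make_data`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's K-means with random initial centers."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster lost all its members.
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def compactness(points, centers, labels):
    """Available quantity: members vs. their estimated centers."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)


def ace(truth, centers, labels):
    """Assumed ACE: true generating centers vs. estimated centers.
    Computable only because the synthetic ground truth is known."""
    return sum(sq_dist(t, centers[l]) for t, l in zip(truth, labels)) / len(truth)


def make_data(true_centers, n_per_cluster, sigma, rng):
    """Gaussian blobs; also returns the generating center of each point."""
    points, truth = [], []
    for c in true_centers:
        for _ in range(n_per_cluster):
            points.append(tuple(rng.gauss(m, sigma) for m in c))
            truth.append(c)
    return points, truth


def sweep(points, truth, ks, rng, restarts=5):
    """For each k, keep the most compact of several runs and report
    (compactness, ace) for it."""
    results = {}
    for k in ks:
        best = None
        for _ in range(restarts):
            centers, labels = kmeans(points, k, rng)
            comp = compactness(points, centers, labels)
            if best is None or comp < best[0]:
                best = (comp, ace(truth, centers, labels))
        results[k] = best
    return results


rng = random.Random(0)
points, truth = make_data([(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)], 60, 1.0, rng)
for k, (comp, err) in sweep(points, truth, range(1, 7), rng).items():
    print(f"k={k}  compactness={comp:.3f}  ACE={err:.3f}")
```

Printing both columns side by side makes it possible to inspect how each quantity changes with k. On real data the `ace` column cannot be computed at all, which is the gap the paper's probabilistic estimate is meant to fill.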
