A new algorithm for initial cluster centers in k-means algorithm

Article ID	Journal	Published Year	Pages	File Type
535928	Pattern Recognition Letters	2011	5 Pages	PDF

Abstract

Clustering is a widely used knowledge discovery technique for revealing structure in a dataset that can be extremely useful to the analyst. In iterative clustering algorithms, the procedure for choosing initial cluster centers is extremely important because it directly affects the final clusters. Since clusters are separated groups in a feature space, it is desirable to select initial centers that are well separated. In this paper, we propose an algorithm to compute initial cluster centers for the k-means algorithm. The algorithm is applied to several datasets of different dimensions for illustrative purposes. The proposed algorithm is observed to perform well in obtaining initial cluster centers for the k-means algorithm.
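
The idea of well-separated initial centers can be illustrated with a small sketch. The code below is not the algorithm proposed in the paper, whose details are not given in this abstract; it is a generic farthest-first (maximin) traversal, swapped in purely for illustration. The function name `farthest_first_centers`, the choice to seed with the point farthest from the data mean, and the toy data are all assumptions.

```python
import numpy as np

def farthest_first_centers(X, k):
    """Pick k well-separated rows of X by farthest-first (maximin) traversal.

    Illustrative only: this is a generic heuristic, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    if not 1 <= k <= len(X):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: seed with the point farthest from the overall mean.
    first = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    # Distance from every point to its nearest chosen center so far.
    nearest = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < k:
        # The next center is the point farthest from all chosen centers.
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen]

# Hypothetical toy data: three loose groups in two dimensions.
X = np.array([[0.0, 0.0], [0.1, 0.2],
              [5.0, 5.0], [5.2, 4.9],
              [0.0, 9.0], [0.3, 9.1]])
centers = farthest_first_centers(X, 3)
```

On this toy data the sketch returns one point from each group, which could then be passed to a k-means implementation as its starting centers. A heuristic of this kind is sensitive to outliers, which is one reason more careful initialization schemes are studied.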

► We propose an algorithm to compute initial cluster centers for the k-means algorithm.
► We choose the two variables that best describe the variation in the dataset.
► We use real datasets to show the practical applicability of the proposed algorithm.
► The proposed algorithm performs well in obtaining the initial cluster centers.

Keywords

k-means algorithm; Rand index; Initial cluster centers