An efficient k-means clustering filtering algorithm using density based initial cluster centers

Article ID	Journal	Published Year	Pages	File Type
4944221	Information Sciences	2017	6 Pages	PDF

Abstract

k-means is a preeminent partition-based clustering method that finds k clusters in a given dataset by iteratively computing the distance from each point to the k cluster centers. The filtering algorithm improves the performance of k-means by imposing an index structure on the dataset, which reduces the number of cluster centers searched when finding the nearest center of a point. The performance of the filtering algorithm is influenced by the degree of separation between the initial cluster centers. In this paper, we propose an efficient initial seed selection method, RDBI, that improves the performance of the k-means filtering method by placing the seed points in dense areas of the dataset while keeping them well separated. The dense areas are identified by representing the data points in a kd-tree. A comprehensive experimental analysis evaluates the proposed method against state-of-the-art initialization methods and shows that it is efficient in terms of both running time and clustering accuracy.
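
To make the idea of dense, well-separated seeds concrete, the following is a minimal Python sketch. It is not the RDBI procedure, whose details the abstract does not give. It assumes a simple stand-in heuristic: split the data into kd-tree-style leaf buckets, estimate each bucket's density as its point count divided by its bounding-box volume, and greedily pick bucket centroids that score highest on density times distance to the nearest seed already chosen. The names `build_leaves`, `leaf_summary`, `density_seeds`, and the `leaf_size` parameter are hypothetical and introduced only for this illustration.

```python
import math


def build_leaves(points, leaf_size):
    """Split points at the median of the widest dimension (kd-tree style)
    and return the resulting leaf buckets."""
    if len(points) <= leaf_size:
        return [points]
    dims = len(points[0])
    spreads = [max(p[d] for p in points) - min(p[d] for p in points)
               for d in range(dims)]
    axis = spreads.index(max(spreads))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (build_leaves(ordered[:mid], leaf_size)
            + build_leaves(ordered[mid:], leaf_size))


def leaf_summary(bucket):
    """Return (centroid, density) for one leaf bucket.
    Density is an assumed estimate: point count / bounding-box volume."""
    dims = len(bucket[0])
    centroid = tuple(sum(p[d] for p in bucket) / len(bucket)
                     for d in range(dims))
    volume = 1.0
    for d in range(dims):
        side = max(p[d] for p in bucket) - min(p[d] for p in bucket)
        volume *= max(side, 1e-9)  # guard against zero-width boxes
    return centroid, len(bucket) / volume


def density_seeds(points, k, leaf_size=8):
    """Greedily choose k leaf centroids that are dense and far apart."""
    leaves = [leaf_summary(b) for b in build_leaves(list(points), leaf_size)]
    if k > len(leaves):
        raise ValueError("k exceeds the number of leaf buckets")
    # The first seed is the centroid of the densest leaf.
    seeds = [max(leaves, key=lambda leaf: leaf[1])[0]]
    while len(seeds) < k:
        def score(leaf):
            centroid, density = leaf
            # Favor dense leaves that are far from every seed chosen so far.
            return density * min(math.dist(centroid, s) for s in seeds)
        seeds.append(max(leaves, key=score)[0])
    return seeds


# Usage: four 4x4 grids of points placed at the corners of a large square.
corners = [(0, 0), (0, 100), (100, 0), (100, 100)]
data = [(cx + i, cy + j) for cx, cy in corners
        for i in range(4) for j in range(4)]
print(density_seeds(data, k=4, leaf_size=16))
```

Multiplying density by distance is one simple way to trade the two goals off against each other; the returned seeds could then be passed as initial centers to any k-means implementation, including a filtering variant.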

Keywords

K-means clustering; kd-tree; Initial cluster centers; Knowledge discovery