Two-level k-means clustering algorithm for k–τ relationship establishment and linear-time classification

Article ID	Journal	Published Year	Pages	File Type
533594	Pattern Recognition	2010	9 Pages	PDF

Abstract

Partitional clustering algorithms, which partition the dataset into a pre-defined number of clusters, can be broadly classified into two types: algorithms which explicitly take the number of clusters as input and algorithms that take the expected size of a cluster as input. In this paper, we propose a variant of the k-means algorithm and prove that it is more efficient than standard k-means algorithms. An important contribution of this paper is the establishment of a relation between the number of clusters and the size of the clusters in a dataset through the analysis of our algorithm. We also demonstrate that the integration of this algorithm as a pre-processing step in classification algorithms reduces their running-time complexity.
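The two input styles contrasted in the abstract can be illustrated with a minimal sketch. It is not the paper's two-level algorithm: it sets plain Lloyd's k-means (number of clusters k as input) beside single-pass leader clustering (a distance threshold as input). Treating τ as such a distance threshold is an assumption made here, and the names `kmeans`, `leader_clustering`, `tau`, and the toy `points` array are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means: the number of clusters k is the input."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster went empty.
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

def leader_clustering(points, tau):
    """Single-pass leader clustering: the threshold tau (assumed here to be
    a distance bound on cluster size) is the input; k is an output."""
    leaders = []
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        for j, leader in enumerate(leaders):
            if np.linalg.norm(p - leader) <= tau:
                labels[i] = j  # join the first leader within tau
                break
        else:
            leaders.append(p)  # no leader close enough: p starts a new cluster
            labels[i] = len(leaders) - 1
    return np.array(leaders), labels

# Toy data (hypothetical): two tight, well-separated groups in the plane.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

centers, km_labels = kmeans(points, k=2)
small_tau_leaders, small_tau_labels = leader_clustering(points, tau=1.0)
large_tau_leaders, _ = leader_clustering(points, tau=10.0)
print(len(small_tau_leaders), len(large_tau_leaders))
```

On this toy data a larger threshold yields fewer clusters, which is the kind of dependence between cluster count and cluster size that the abstract refers to; the precise relation established in the paper is not reproduced here.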

Keywords

k-Nearest neighbor classifier; Clustering; Classification; Support vector machines; k-Means