Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6861436 | Knowledge-Based Systems | 2018 | 15 Pages |
Abstract
The k-means algorithm has proven to be an effective technique for clustering large-scale data sets. However, traditional k-means type clustering algorithms cannot effectively distinguish the discriminative capabilities of features in the clustering process. In this paper, we present a new k-means type clustering framework by extending W-k-means with an l2-norm regularization on the feature weights. Based on this framework, we propose the l2-Wkmeans algorithm, which uses conventional means as centroids for clustering numerical data sets, and present the l2-NOF and l2-NDM algorithms, which use two different smooth mode representatives for clustering categorical data sets. First, a new objective function is developed for the clustering framework. Then, the corresponding updating rules for the centroids, the membership matrix, and the feature weights are derived theoretically for the new algorithms. We conduct extensive experiments to evaluate the performance of the proposed algorithms on numerical and categorical data sets. The experimental studies demonstrate that the proposed algorithms deliver consistently promising results in comparison with competing approaches such as basic k-means, W-k-means, MKM_NOF, and MKM_NDM, with respect to four metrics: Accuracy, RandIndex, Fscore, and Normalized Mutual Information (NMI).
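To make the idea of l2-regularized feature weighting concrete, the following is a minimal sketch for numerical data. It is not the paper's algorithm: the abstract does not state the objective function, so the sketch assumes one plausible form, namely the weighted within-cluster squared distance plus `lam` times the squared l2 norm of the weights, with the weights constrained to be non-negative and sum to one. Under that assumption the weight update reduces to a Euclidean projection onto the probability simplex. The names `project_to_simplex`, `l2_weighted_kmeans`, and `lam` are hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def l2_weighted_kmeans(X, k, lam=1.0, n_iter=50, seed=0):
    """Feature-weighted k-means with an assumed l2 penalty on the weights.

    Assumed objective (not taken from the paper):
        sum_i sum_j w_j * (x_ij - z_{c(i), j})^2 + lam * sum_j w_j^2
    subject to w_j >= 0 and sum_j w_j = 1.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    w = np.full(m, 1.0 / m)                   # start from uniform weights
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Membership update: assign each point to the nearest centroid
        # under the current feature weights.
        sq = (X[:, None, :] - centroids[None, :, :]) ** 2   # shape (n, k, m)
        labels = np.argmin(sq @ w, axis=1)
        # Centroid update: per-cluster mean (empty clusters keep their centroid).
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:
                centroids[c] = members.mean(axis=0)
        # Weight update: minimizing sum_j w_j * D_j + lam * sum_j w_j^2 over
        # the simplex equals projecting -D / (2 * lam) onto the simplex.
        D = ((X - centroids[labels]) ** 2).sum(axis=0)
        w = project_to_simplex(-D / (2.0 * lam))
    return labels, centroids, w
```

In this sketch, features with small within-cluster dispersion receive larger weights, and a larger `lam` pushes the weights toward uniform. Because the dispersions scale with the number of points, `lam` would typically need tuning per data set.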
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Xiaohui Huang, Xiaofei Yang, Junhui Zhao, Liyan Xiong, Yunming Ye