Article code: 6866543
Journal code: 679631
Publication year: 2014
English article: 11 pages (PDF)
Full-text version: free download
English title of the ISI article
The k-modes type clustering plus between-cluster information for categorical data
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
The k-modes algorithm and its modified versions are widely used to cluster categorical data. However, in the iterative process of these algorithms, the updating formulae for quantities such as the partition matrix, cluster centers and attribute weights are computed from within-cluster information only. Between-cluster information is not considered, which may yield clustering results with weak separation among different clusters. Therefore, in this paper, we propose a new term that reflects this separation. New optimization objective functions are then developed by adding the proposed term to the objective functions of several existing k-modes algorithms. Under this optimization framework, the corresponding updating formulae and the convergence of the iterative process are rigorously derived. These improvements enhance the effectiveness of the existing k-modes algorithms while keeping them simple. Experimental studies on real data sets from the UCI (University of California Irvine) Machine Learning Repository show that the improved algorithms outperform their original counterparts in clustering categorical data sets and are also scalable to large data sets, owing to their linear time complexity with respect to the number of data objects, attributes or clusters.
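For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard k-modes iteration (simple matching dissimilarity, nearest-mode assignment, per-attribute mode update), not the improved algorithms proposed in the paper: the abstract does not give the form of the between-cluster term, so it is deliberately left out. The function names, the toy data and the choice of initial modes are assumptions made for this sketch.

```python
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: number of attributes where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category of each attribute over the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, init_modes, max_iter=100):
    """Alternate assignment and mode update until the labels stop changing."""
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object joins the cluster with the nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_distance(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: uses only the objects inside each cluster
        # (within-cluster information), as the abstract points out.
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = column_modes(members)
    return labels, modes


def within_cluster_cost(data, labels, modes):
    """Objective of plain k-modes: total mismatches between objects and their modes."""
    return sum(matching_distance(row, modes[lab]) for row, lab in zip(data, labels))


# Hypothetical toy data set with three categorical attributes.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
    ("green", "large", "square"),
]
labels, modes = k_modes(data, init_modes=[data[0], data[3]])
cost = within_cluster_cost(data, labels, modes)
```

Each pass touches every object, attribute and cluster once, which is consistent with the linear time complexity mentioned in the abstract. Nothing in the cost above rewards modes for being far apart from one another, which is the gap the proposed term is meant to address.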
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 133, 10 June 2014, Pages 111-121