Article code: 384139
Journal code: 660841
Publication year: 2012
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
A cluster centers initialization method for clustering categorical data
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, because the k-modes algorithm can converge to any of numerous local minima, its performance depends strongly on the initial cluster centers. Most existing cluster center initialization methods are designed for numerical data. Because categorical data lack a geometric structure, these methods are not applicable to categorical data. This paper proposes a novel initialization method for categorical data and applies it to the k-modes algorithm. The method integrates distance and density to select initial cluster centers and overcomes shortcomings of the existing initialization methods for categorical data. Experimental results show that the proposed initialization method is effective and, because its time complexity is linear in the number of data objects, can be applied to large data sets.
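
To make the dependence on initial centers concrete, the following is a minimal sketch of a standard k-modes loop. It assumes the simple matching dissimilarity (the number of attributes on which two objects differ) and attribute-wise modes as centers. The function names are illustrative and are not taken from the paper.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Number of attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_of(objects):
    """Attribute-wise most frequent value of a non-empty list of objects."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*objects))


def k_modes(data, centers, max_iter=100):
    """Alternate assignment and mode update until the labels stop changing."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assign each object to its nearest center (ties go to the lowest index).
        new_labels = [
            min(range(len(centers)),
                key=lambda j: matching_dissimilarity(x, centers[j]))
            for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each center as the mode of its members; empty clusters keep
        # their previous center (a simplification chosen for this sketch).
        for j in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = mode_of(members)
    return centers, labels
```

Since the loop only ever moves to a nearby local minimum, the `centers` argument largely determines the final partition, which is why the choice of initial centers matters.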


► Propose a new cluster centers selection method for categorical data.
► Several rules are presented to select initial cluster centers from data sets.
► Integrate distance and density to avoid outliers and boundary points.
► Apply neighbor objects to construct the candidates for initial cluster centers.
► The performance and scalability of the proposed method are investigated.
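
The idea of combining distance and density can be illustrated with a generic sketch. This is not the paper's selection rules, and it omits the neighbor-based candidate construction. It assumes that an object's density is the average relative frequency of its attribute values, and that each further center maximizes density multiplied by the distance to the nearest center already chosen. Both assumptions and all names are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Number of attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b))


def density(data):
    """Average relative frequency of each object's attribute values (0 to 1)."""
    n, m = len(data), len(data[0])
    freq = [Counter(col) for col in zip(*data)]
    return [sum(freq[j][x[j]] for j in range(m)) / (n * m) for x in data]


def init_centers(data, k):
    """Pick k initial centers that are both dense and far from each other."""
    dens = density(data)
    # Assumed rule: start from the densest object.
    first = max(range(len(data)), key=lambda i: dens[i])
    chosen = [first]
    # min_dist[i] is the distance from object i to its nearest chosen center.
    min_dist = [matching_dissimilarity(x, data[first]) for x in data]
    while len(chosen) < k:
        # Distance alone would favor outliers; weighting by density avoids them.
        nxt = max(range(len(data)), key=lambda i: min_dist[i] * dens[i])
        chosen.append(nxt)
        min_dist = [min(d, matching_dissimilarity(x, data[nxt]))
                    for d, x in zip(min_dist, data)]
    return [data[i] for i in chosen]
```

For a fixed number of centers and attributes, this sketch makes a constant number of passes over the data, so its cost grows linearly with the number of objects. If `k` exceeds the number of distinct objects, it may return duplicate centers.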

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 39, Issue 9, July 2012, Pages 8022–8029