Article code: 489838
Journal code: 704634
Publication year: 2015
English article: 6-page PDF
Full-text version: Free download
English title of the ISI article
The k-modes Algorithm with Entropy Based Similarity Coefficient
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science (General)
English abstract

Clustering is the process of organizing a dataset into isolated groups such that data points in the same group are more similar and data points in different groups are more dissimilar. The k-modes algorithm, well known for its simplicity, is a popular partitioning algorithm for clustering categorical data. In this paper, we discuss the limitations of the distance function used in this algorithm with an illustrative example, and then we propose a similarity coefficient based on information entropy. We analyze the time complexity of the k-modes algorithm with the proposed similarity coefficient. The main advantage of this coefficient is that it improves clustering accuracy while retaining the scalability of the k-modes algorithm. We perform scalability tests on synthetic datasets.
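
For orientation, the following is a minimal sketch of the conventional k-modes loop that the abstract takes as its starting point, using the simple matching distance (the count of mismatched attributes). It also computes the ordinary Shannon entropy of a single categorical attribute, the general quantity the phrase "information entropy" refers to. The abstract does not give the formula of the proposed similarity coefficient, so that coefficient is not reproduced here; all function names and the toy records are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def simple_matching_distance(x, y):
    """Number of attributes on which two categorical records differ."""
    return sum(a != b for a, b in zip(x, y))


def mode_of(records):
    """Attribute-wise most frequent value of a group of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def attribute_entropy(values):
    """Shannon entropy, in bits, of one categorical attribute.

    Illustrative only: this is NOT the paper's similarity coefficient.
    """
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def k_modes(records, initial_modes, max_iter=100):
    """Conventional k-modes: assign to the nearest mode, then update modes."""
    modes = list(initial_modes)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under simple matching distance.
        new_labels = [
            min(range(len(modes)),
                key=lambda j: simple_matching_distance(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:  # no record changed cluster: converged
            break
        labels = new_labels
        # Update step: recompute each non-empty cluster's mode.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = mode_of(members)
    return labels, modes


# Hypothetical toy data: six records with three categorical attributes.
records = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
labels, modes = k_modes(records, [records[0], records[3]])
```

Because the simple matching distance only counts mismatches, every attribute and every category value contributes equally to the assignment step, regardless of how informative it is. An entropy-style weighting is one way such a limitation could be addressed, though the specific form used in the paper is not stated in the abstract.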

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 50, 2015, Pages 93-98