کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
536946 870648 2005 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
QROCK: A quick version of the ROCK algorithm for clustering of categorical data
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
QROCK: A quick version of the ROCK algorithm for clustering of categorical data
چکیده انگلیسی

The ROCK algorithm is an agglomerative hierarchical clustering algorithm for clustering categorical data [Guha S., Rastogi, R., Shim, K., 1999. ROCK: A robust clustering algorithm for categorical attributes. In: Proc. IEEE Internat. Conf. Data Engineering, Sydney, March 1999]. In this paper we prove that under certain conditions, the final clusters obtained by the algorithm are nothing but the connected components of a certain graph with the input data-points as vertices. We propose a new algorithm QROCK which computes the clusters by determining the connected components of the graph. This leads to a very efficient method of obtaining the clusters giving a drastic reduction of the computing time of the ROCK algorithm. We also justify that it is more practical for specifying the similarity threshold rather than specifying the desired number of clusters a priori. The QROCK algorithm also detects the outliers in this process. We also discuss a new similarity measure for categorical attributes.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition Letters - Volume 26, Issue 15, November 2005, Pages 2364–2373
نویسندگان
, , ,