Article code | Journal code | Publication year | English article | Full-text version
382872 | 660794 | 2015 | 13 pages, PDF | Free download
English title of the ISI article
Efficient agglomerative hierarchical clustering
Persian translation of the title
خوشه بندی سلسله مراتبی خوشه ای کارآمد
Keywords
Related topics
Engineering and basic sciences, Computer engineering, Artificial intelligence
English abstract


• An efficient hybrid hierarchical clustering approach is proposed, based on the agglomerative method.
• It performs consistently with different distance measures.
• It performs consistently on data with different distributions and sizes.

Hierarchical clustering is of great importance in data analytics, especially given the exponential growth of real-world data. These data are often unlabelled, and little prior domain knowledge is available. One challenge in handling such huge data collections is the computational cost. In this paper, we aim to improve efficiency by introducing a set of methods for agglomerative hierarchical clustering. Instead of building cluster hierarchies from raw data points, our approach builds a hierarchy from a group of centroids. Each centroid represents a group of adjacent points in the data space. With this approach, feature extraction or dimensionality reduction is not required. To evaluate our approach, we conducted a comprehensive experimental study. We tested the approach with different clustering methods (i.e., UPGMA and SLINK), data distributions (i.e., normal and uniform), and distance measures (i.e., Euclidean and Canberra). The experimental results indicate that, using the centroid-based approach, computational cost can be significantly reduced without compromising clustering performance. The performance of this approach is relatively consistent regardless of variation in the settings, i.e., clustering methods, data distributions, and distance measures.
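
The centroid-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how adjacent points are grouped into centroids, so the grid binning below is purely an assumption, and the names `grid_centroids`, `agglomerate`, and `centroid_hierarchical` are hypothetical. The merge loop is a naive agglomerative procedure with average linkage (in the spirit of UPGMA, ignoring group sizes) or single linkage (the criterion SLINK computes, though not the SLINK algorithm itself); it is not the authors' implementation.

```python
import math
import random
from collections import defaultdict

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def canberra(a, b):
    # Canberra distance: sum of |x - y| / (|x| + |y|), skipping 0/0 terms.
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total

def grid_centroids(points, cell):
    """Assumed grouping rule: points in the same grid cell share one centroid."""
    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(math.floor(c / cell) for c in p)].append(i)
    centroids, members = [], []
    for idx in cells.values():
        dim = len(points[idx[0]])
        centroids.append(tuple(sum(points[i][d] for i in idx) / len(idx)
                               for d in range(dim)))
        members.append(idx)
    return centroids, members

def agglomerate(items, k, dist=euclidean, method="average"):
    """Naive agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(items))]

    def linkage(a, b):
        d = [dist(items[i], items[j]) for i in a for j in b]
        return min(d) if method == "single" else sum(d) / len(d)

    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def centroid_hierarchical(points, k, cell=1.0, dist=euclidean, method="average"):
    """Cluster the centroids, then give every raw point its centroid's label."""
    centroids, members = grid_centroids(points, cell)
    labels = [None] * len(points)
    for label, cluster in enumerate(agglomerate(centroids, k, dist, method)):
        for c in cluster:
            for i in members[c]:
                labels[i] = label
    return labels, len(centroids)

# Two well-separated, normally distributed blobs (synthetic illustration data).
random.seed(0)
blob_a = [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(200)]
blob_b = [(random.gauss(10, 0.5), random.gauss(10, 0.5)) for _ in range(200)]
labels, n_centroids = centroid_hierarchical(blob_a + blob_b, k=2)
```

The merge loop only ever sees `n_centroids` items rather than all 400 points, which is where a saving of the kind the abstract describes would come from. Passing `dist=canberra` or `method="single"` switches the distance measure or the linkage criterion.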

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 42, Issue 5, 1 April 2015, Pages 2785–2797