Article code: 402333
Journal code: 676906
Publication year: 2014
English article: 14-page PDF
Full-text version: Free download
English title of the ISI article
A heuristic approach to effective and efficient clustering on uncertain objects
Persian translation of the title
رویکرد اکتشافی به خوشه موثر و کارآمد در اشیاء نامشخص
Keywords
Clustering uncertain objects; UK-means; Expected Euclidean distance; Expected squared Euclidean distance
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract


• We reduce UK-means to K-means.
• We experimentally show that K-means performs much faster than existing pruning algorithms.
• We propose Approximate UK-means to heuristically identify boundary objects and re-assign them to better clusters.
• We propose three models for representing the cluster representative. To our knowledge, this is the first work to introduce an uncertain model of the cluster representative.

We study the problem of clustering uncertain objects whose locations are described by probability density functions (pdfs). We analyze existing pruning algorithms and show experimentally that they introduce a new performance bottleneck: the overhead of pruning candidate clusters when assigning each uncertain object in each iteration. In this article, we show that under squared Euclidean distance, UK-means (without pruning techniques) reduces to K-means and runs much faster than the pruning algorithms, although its clustering results show some discrepancies because a different distance function is used. We therefore propose Approximate UK-means, which heuristically identifies boundary-case objects and re-assigns them to better clusters. We propose three models for representing the cluster representative (certain model, uncertain model and heuristic model) to calculate the expected squared Euclidean distance between objects and cluster representatives. Our experimental results show that, on average, the execution time of Approximate UK-means is only 25% more than that of K-means, and that our approach reduces the discrepancies in K-means’ clustering results by up to 70%.
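
The reduction mentioned in the abstract rests on a standard identity: for an uncertain object o with expected location m and a fixed point c, E[||o - c||^2] = ||m - c||^2 + (total variance of o). The variance term does not depend on c, so the nearest center under expected squared Euclidean distance is the nearest center to m. The sketch below illustrates this under stated assumptions and is not the paper's implementation: each uncertain object is approximated by equally weighted samples drawn from its pdf, cluster representatives are plain points (roughly the "certain model"), and all function names are hypothetical.

```python
import random

def mean_point(samples):
    """Centroid (expected location) of an object given as equally weighted samples."""
    n, dim = len(samples), len(samples[0])
    return [sum(s[d] for s in samples) / n for d in range(dim)]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def expected_sq_dist(samples, center):
    """Expected squared Euclidean distance, averaged over the samples."""
    return sum(sq_dist(s, center) for s in samples) / len(samples)

def total_variance(samples):
    """Spread of the object around its own centroid; independent of any center."""
    return expected_sq_dist(samples, mean_point(samples))

def assign_by_esed(samples, centers):
    """UK-means style assignment: minimise the expected squared distance."""
    return min(range(len(centers)),
               key=lambda j: expected_sq_dist(samples, centers[j]))

def assign_by_centroid(samples, centers):
    """K-means style assignment: use only the object's centroid."""
    m = mean_point(samples)
    return min(range(len(centers)), key=lambda j: sq_dist(m, centers[j]))

# Assumed toy data: one 2-D uncertain object sampled from a Gaussian pdf.
rng = random.Random(0)
obj = [[rng.gauss(1.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(200)]
centers = [[0.0, 0.0], [1.5, 2.5], [4.0, 1.0]]

# The identity holds for every candidate center, up to rounding error.
for c in centers:
    lhs = expected_sq_dist(obj, c)
    rhs = sq_dist(mean_point(obj), c) + total_variance(obj)
    assert abs(lhs - rhs) < 1e-9

# Both assignment rules therefore pick the same cluster.
assert assign_by_esed(obj, centers) == assign_by_centroid(obj, centers)
```

Because the two rules agree, each object's centroid can be computed once up front and ordinary K-means run on the centroids. The discrepancies the abstract mentions arise relative to UK-means with expected (non-squared) Euclidean distance, which this sketch does not cover.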

Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 66, August 2014, Pages 112–125