Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
10361295 | 870090 | 2015 | 40-page PDF | Free download |
English title of the ISI article
A distributed framework for trimmed Kernel k-Means clustering
Related topics
Engineering and Basic Sciences
Computer Engineering
Computer Vision and Pattern Recognition
English abstract
Data clustering is an unsupervised learning task that has found many applications in various scientific fields. The goal is to find subgroups of closely related data samples (clusters) in a set of unlabeled data. Kernel k-Means is a state-of-the-art clustering algorithm. However, in contrast to clustering algorithms that can work using only a limited percentage of the data at a time, Kernel k-Means is a global clustering algorithm. It requires the computation of the kernel matrix, which takes O(n²d) time and O(n²) space in memory. As datasets grow larger, the application of Kernel k-Means becomes infeasible on a single computer, a fact that strongly suggests a distributed approach. In this paper, we present such an approach to the Kernel k-Means clustering algorithm, in order to make its application to a large number of samples feasible and, thus, achieve high-performance clustering results on very big datasets. Our distributed approach follows the MapReduce programming model and consists of three stages: the kernel matrix computation, a novel kernel matrix trimming method, and the Kernel k-Means clustering algorithm.
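The abstract gives no implementation details, so the following is only a minimal single-machine sketch of standard Kernel k-Means, meant to show where the O(n²d) time and O(n²) memory costs of the kernel matrix come from. It is not the authors' distributed MapReduce framework and does not attempt their trimming method. The RBF kernel, the `gamma` parameter, and the names `rbf_kernel_matrix`, `kernel_kmeans`, and `init_labels` are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel_matrix(X, gamma=1.0):
    """Dense RBF kernel matrix (assumed kernel choice).

    For n samples of dimension d this costs O(n^2 d) time and O(n^2) memory,
    which is the bottleneck the abstract refers to.
    """
    sq_norms = np.sum(X * X, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    return np.exp(-gamma * sq_dists)


def kernel_kmeans(K, k, n_iter=100, seed=0, init_labels=None):
    """Standard Kernel k-Means on a precomputed kernel matrix K (n x n).

    init_labels is a hypothetical convenience parameter: if given, it is used
    as the starting assignment instead of a random one.
    """
    n = K.shape[0]
    if init_labels is None:
        rng = np.random.default_rng(seed)
        labels = rng.integers(0, k, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            size = members.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # Squared feature-space distance to the cluster mean:
            # K_ii - (2/|c|) * sum_j K_ij + (1/|c|^2) * sum_{j,l} K_jl
            cross = K[:, members].sum(axis=1) / size
            within = K[np.ix_(members, members)].sum() / (size * size)
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels
```

Every iteration reads the full n × n matrix `K`, which is why the algorithm is described as global: it cannot be run on a small batch of samples at a time without some approximation or distribution of the kernel matrix.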
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 8, August 2015, Pages 2685-2698
Authors
Nikolaos Tsapanos, Anastasios Tefas, Nikolaos Nikolaidis, Ioannis Pitas