کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6863752 | 1439520 | 2018 | 33 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Large-scale k-means clustering via variance reduction
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
هوش مصنوعی
پیش نمایش صفحه اول مقاله

چکیده انگلیسی
With the increase of the volume of data such as images in web, it is challenging to perform k-means clustering on millions or even billions of images efficiently. One of the reasons is that k-means requires to use a batch of training data to update cluster centers at every iteration, which is time-consuming. Conventionally, k-means is accelerated by using one or a mini-batch of instances to update the centers, which leads to a bad performance due to the stochastic noise. In the paper, we decrease such stochastic noise, and accelerate k-means by using variance reduction technique. Specifically, we propose a position correction mechanism to correct the drift of the cluster centers, and propose a variance reduced k-means named VRKM. Furthermore, we optimize VRKM by reducing its computational cost, and propose a new variant of the variance reduced k-means named VRKM++. Comparing with VRKM, VRKM++ does not have to compute the batch gradient, and is more efficient. Extensive empirical studies show that our methods VRKM and VRKM++ outperform the state-of-the-art method, and obtain about 2â¯Ã⯠and 4â¯Ã⯠speedups for large-scale clustering, respectively. The source code is available at https://www.github.com/YaweiZhao/VRKM_sofia-ml.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 307, 13 September 2018, Pages 184-194
Journal: Neurocomputing - Volume 307, 13 September 2018, Pages 184-194
نویسندگان
Yawei Zhao, Yuewei Ming, Xinwang Liu, En Zhu, Kaikai Zhao, Jianping Yin,