Approximate pairwise clustering for large data sets via sampling plus extension

Article code	Journal code	Publication year	English article
530357	869761	2011	14 pages PDF


Keywords

Graph embedding; Spectral clustering; Out-of-sample extension; Selective sampling

Related subjects

Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition


Abstract

Pairwise clustering methods have shown great promise for many real-world applications. However, the computational demands of these methods make them impractical for use with large data sets. The contribution of this paper is a simple but efficient method, called eSPEC, that makes clustering feasible for problems involving large data sets. Our solution adopts a “sampling, clustering plus extension” strategy. The methodology starts by selecting a small number of representative samples from the relational pairwise data using a selective sampling scheme; then the chosen samples are grouped using a pairwise clustering algorithm combined with local scaling; and finally, the label assignments of the remaining instances in the data are extended as a classification problem in a low-dimensional space, which is explicitly learned from the labeled samples using a cluster-preserving graph embedding technique. Extensive experimental results on several synthetic and real-world data sets demonstrate that our method makes approximate clustering of large data sets feasible and accelerates clustering of loadable data sets.
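The three-stage "sampling, clustering plus extension" strategy described in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not eSPEC itself, and every function name below is hypothetical. It assumes the data are points in Euclidean space, and it swaps in simpler stand-ins at each stage: maximin (farthest-first) selection in place of the paper's selective sampling scheme, average-linkage agglomerative clustering in place of pairwise clustering with local scaling, and a nearest-sample rule in the original space in place of classification in a learned graph embedding.

```python
import math

def maximin_sample(points, m):
    """Stage 1 (stand-in): pick m spread-out indices by farthest-first traversal."""
    chosen = [0]
    # d[i] = distance from point i to its closest already-chosen sample
    d = [math.dist(p, points[0]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=lambda i: d[i])
        chosen.append(nxt)
        d = [min(d[i], math.dist(points[i], points[nxt]))
             for i in range(len(points))]
    return chosen

def average_linkage(D, k):
    """Stage 2 (stand-in): merge clusters on a pairwise distance matrix D
    until only k clusters remain; returns one label per row of D."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(D[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    labels = [0] * len(D)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def extend_labels(points, sample_idx, sample_labels):
    """Stage 3 (stand-in): give every point the label of its nearest sample."""
    out = []
    for p in points:
        j = min(range(len(sample_idx)),
                key=lambda s: math.dist(p, points[sample_idx[s]]))
        out.append(sample_labels[j])
    return out

def approx_cluster(points, m, k):
    """Sample m points, cluster only those, then extend labels to all points."""
    idx = maximin_sample(points, m)
    D = [[math.dist(points[i], points[j]) for j in idx] for i in idx]
    return extend_labels(points, idx, average_linkage(D, k))

# Two well-separated 5x5 grids of points (toy data, not from the paper).
points = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
points += [(10 + x * 0.1, 10 + y * 0.1) for x in range(5) for y in range(5)]
labels = approx_cluster(points, m=6, k=2)
```

The point of the design is that the quadratic-cost clustering step only ever sees the m sampled items, while the remaining items are labeled by a cheap per-item rule. In this sketch that rule costs one distance computation per sample for each item.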

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 44, Issue 2, February 2011, Pages 222–235

Authors

Liang Wang, Christopher Leckie, Ramamohanarao Kotagiri, James Bezdek
