Free article download: Singleton indexes for nearest neighbor search

Article code	Journal code	Publication year	English article	Full-text version
396783	670591	2016	19-page PDF	Free download

English title of the ISI article

Singleton indexes for nearest neighbor search

Keywords

Nearest neighbor search; index tuning; metric index; pivot selection

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence

English abstract

• A description and analysis of the new metric indexing methodology ANNI.
• A series of indexes based on this methodology.
• An extensive experimental comparison with the current state of the art.
• An open-source implementation of all the presented algorithms.

The nearest neighbor search problem is fundamental in computer science. In spite of the effort of a vast number of research groups, the instances that allow an efficient solution are limited to databases of objects with low intrinsic dimension. For intrinsically high-dimensional data, the only possible solution is to compromise and use approximate or probabilistic approaches. For the instances in between, there is an overwhelmingly large number of indexes with claimed good performance. However, the problem of parameter selection makes them unwieldy for use outside the research community. Even if the indexes can be tuned correctly, either the number of operations needed for index construction and tuning is prohibitively large, or there are obscure parameters to tune. These restrictions force users from different fields to solve real-world instances by brute force.

In this paper, we present a family of indexing algorithms designed for end users. They require as input the database, a query sample, and the amount of space available. Our building blocks are standard discarding rules, and the indexes add routing objects such as pivots, hyperplane references, or cluster centroids. The indexes are built incrementally and self-tune by greedily searching for a global optimum in performance.

We show experimentally that, using this oblivious strategy, our indexes outperform state-of-the-art, manually fine-tuned indexes. For example, our indexes are twice as fast as the fastest alternative (LC, EPT or VPT) for most of our datasets. In the case of LC, the fastest alternative for high-dimensional datasets, the difference is smaller than 5%. In the same case, our indexes are at least one order of magnitude faster to build. This superior performance is maintained for large, high-dimensional datasets (100 million 12-dimensional objects). In this benchmark, our best index is two times faster than the closest alternative (VPT), six times faster than the majority of indexes, and more than sixty times faster than the sequential scan.
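
The ideas in the abstract (a standard discarding rule, pivots as routing objects, and an incremental, greedy, self-tuning build driven by a query sample and a space budget) can be illustrated with a small sketch. This is not the authors' ANNI implementation. The class `PivotTable`, the function `build`, the random pivot choice, the `max_pivots` stand-in for the space budget, and the stopping rule are all assumptions made for illustration. The discarding rule is the usual triangle-inequality lower bound |d(q,p) - d(p,x)| <= d(q,x).

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.dist(a, b)


class PivotTable:
    """Hypothetical pivot-table index (illustrative, not the paper's code)."""

    def __init__(self, db, dist=euclidean):
        self.db = db
        self.dist = dist
        self.pivots = []  # routing objects
        self.table = []   # table[j][i] = dist(pivots[j], db[i])

    def add_pivot(self, pivot):
        self.pivots.append(pivot)
        self.table.append([self.dist(pivot, x) for x in self.db])

    def remove_last_pivot(self):
        self.pivots.pop()
        self.table.pop()

    def nearest(self, q):
        """Return (index, distance, number of distance evaluations)."""
        dq = [self.dist(q, p) for p in self.pivots]
        evals = len(dq)
        best_i, best_d = -1, math.inf
        for i, x in enumerate(self.db):
            # Lower bound on d(q, x) from the triangle inequality.
            lower = max(
                (abs(dq[j] - row[i]) for j, row in enumerate(self.table)),
                default=0.0,
            )
            if lower >= best_d:
                continue  # x cannot beat the current best; discard it
            d = self.dist(q, x)
            evals += 1
            if d < best_d:
                best_i, best_d = i, d
        return best_i, best_d, evals


def build(db, query_sample, max_pivots, seed=0):
    """Greedy incremental build: add random pivots while the sample cost drops.

    max_pivots is an assumed proxy for the available space.
    """
    rng = random.Random(seed)
    index = PivotTable(db)

    def cost():
        return sum(index.nearest(q)[2] for q in query_sample)

    best = cost()
    while len(index.pivots) < max_pivots:
        index.add_pivot(rng.choice(db))
        current = cost()
        if current >= best:
            index.remove_last_pivot()  # no improvement: undo and stop
            break
        best = current
    return index
```

A call such as `build(db, query_sample, max_pivots=8).nearest(q)` returns the nearest object's position, its distance, and the number of distance evaluations spent, which is the cost the greedy loop tries to reduce on the query sample.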

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 60, August–September 2016, Pages 50–68

Authors

E.S. Tellez, G. Ruiz, E. Chavez
