Article code: 515640
Journal code: 867057
Publication year: 2012
English article: 14-page PDF
Full-text version: free download
English title of the ISI article
Use of permutation prefixes for efficient and scalable approximate similarity search
Related topics
Engineering and basic sciences, computer engineering, computer science software
English abstract

We present the Permutation Prefix Index (PP-Index), an index data structure that supports efficient approximate similarity search. (This work is a revised and extended version of Esuli (2009b), presented at the 2009 LSDS-IR Workshop, held in Boston.) The PP-Index belongs to the family of permutation-based indexes, which represent any indexed object by "its view of the surrounding world", i.e., a list of the elements of a set of reference objects, sorted by their distance from the indexed object. In its basic formulation, the PP-Index is strongly biased toward efficiency. We show how its effectiveness can easily reach optimal levels just by adopting two "boosting" strategies, multiple index search and multiple query search, both of which have nice parallelization properties. We study both the efficiency and the effectiveness properties of the PP-Index, experimenting with collections of up to one hundred million objects, represented in a very high-dimensional similarity space.
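The idea of representing an object by a prefix of its reference-object permutation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the names `permutation_prefix`, `candidates`, and `euclidean` are hypothetical, the data are toy 2-D points, the distance is Euclidean, and the index is a plain dictionary keyed by the whole prefix, since the abstract does not describe the actual index layout.

```python
import math

def euclidean(a, b):
    # Assumed distance function; any metric on the objects would work here.
    return math.dist(a, b)

def permutation_prefix(obj, references, distance, prefix_len):
    """Ids of the prefix_len reference objects closest to obj, nearest first."""
    # Rank reference ids by distance to obj; ties are broken by id.
    ranked = sorted(range(len(references)),
                    key=lambda i: (distance(obj, references[i]), i))
    return tuple(ranked[:prefix_len])

# Toy data (hypothetical): four reference objects and three indexed objects.
references = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
points = {"a": (1.0, 2.0), "b": (2.0, 1.0), "c": (9.0, 8.0)}
PREFIX_LEN = 2

# Group indexed objects by their permutation prefix.
index = {}
for name, point in points.items():
    key = permutation_prefix(point, references, euclidean, PREFIX_LEN)
    index.setdefault(key, []).append(name)

def candidates(query):
    """Objects whose permutation prefix equals the query's prefix."""
    key = permutation_prefix(query, references, euclidean, PREFIX_LEN)
    return index.get(key, [])

print(candidates((1.5, 3.0)))  # ['a']
```

In this toy run the query shares its prefix only with `a`, so `b` is never examined even though it is also nearby. That is one way to picture the trade-off the abstract describes: a single lookup is cheap but can miss neighbours, which is the kind of loss that searching several indexes or issuing several query variants is meant to recover.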

Research highlights
► The Permutation Prefix Index is a data structure for approximate similarity search.
► PP-Index has good parallelization and scalability properties.
► PP-Index efficiently obtained high recall on a collection of 100 million images.
► PP-Index compares favorably against similar data structures for approximate search.

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 48, Issue 5, September 2012, Pages 889–902