کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
566544 875994 2013 12 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
An improved method of locality sensitive hashing for indexing large-scale and high-dimensional features
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
An improved method of locality sensitive hashing for indexing large-scale and high-dimensional features
چکیده انگلیسی

In recent years, Locality sensitive hashing (LSH) has been popularly used as an effective and efficient index structure of multimedia signals. LSH is originally proposed for resolving the high-dimensional approximate similarity search problem. Until now, many kinds of variations of LSH have been proposed for large-scale indexing. Much of the interest is focused on improving the query accuracy for skewed data distribution and reducing the storage space. However, when using LSH, a final filtering process based on exact similarity measure is needed. When the dataset is large-scale, the number of points to be filtered becomes large. As a result, the filtering speed becomes the bottleneck of improving the query speed when the scale of data becomes larger and larger. Furthermore, we observe a “Non-Uniform” phenomenon in the most popular Euclidean LSH which can degrade the filtering speed dramatically. In this paper, a pivot-based algorithm is proposed to improve the filtering speed by using triangle inequality to prune the search process. Furthermore, a novel method to select an optimal pivot for even larger improvement is provided. The experimental results on two open large-scale datasets show that our method can significantly improve the query speed of Euclidean LSH.


► Analyse the “Non-Uniform” problem of Euclidean LSH formally.
► Propose a pivot-based algorithm to accelerate the query process of Euclidean LSH.
► Provide an effective method to get the optimal pivot point.
► Our method can significantly accelerate the query process of Euclidean LSH.
► Verify the feasibility of accelerating our algorithm through sampling.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Signal Processing - Volume 93, Issue 8, August 2013, Pages 2244–2255
نویسندگان
, , , , ,