Article ID Journal Published Year Pages File Type
566544 Signal Processing 2013 12 Pages PDF
Abstract

In recent years, Locality sensitive hashing (LSH) has been popularly used as an effective and efficient index structure of multimedia signals. LSH is originally proposed for resolving the high-dimensional approximate similarity search problem. Until now, many kinds of variations of LSH have been proposed for large-scale indexing. Much of the interest is focused on improving the query accuracy for skewed data distribution and reducing the storage space. However, when using LSH, a final filtering process based on exact similarity measure is needed. When the dataset is large-scale, the number of points to be filtered becomes large. As a result, the filtering speed becomes the bottleneck of improving the query speed when the scale of data becomes larger and larger. Furthermore, we observe a “Non-Uniform” phenomenon in the most popular Euclidean LSH which can degrade the filtering speed dramatically. In this paper, a pivot-based algorithm is proposed to improve the filtering speed by using triangle inequality to prune the search process. Furthermore, a novel method to select an optimal pivot for even larger improvement is provided. The experimental results on two open large-scale datasets show that our method can significantly improve the query speed of Euclidean LSH.

► Analyse the “Non-Uniform” problem of Euclidean LSH formally. ► Propose a pivot-based algorithm to accelerate the query process of Euclidean LSH. ► Provide an effective method to get the optimal pivot point. ► Our method can significantly accelerate the query process of Euclidean LSH. ► Verify the feasibility of accelerating our algorithm through sampling.

Related Topics
Physical Sciences and Engineering Computer Science Signal Processing
Authors
, , , , ,