Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
566544 | Signal Processing | 2013 | 12 Pages |
In recent years, Locality sensitive hashing (LSH) has been popularly used as an effective and efficient index structure of multimedia signals. LSH is originally proposed for resolving the high-dimensional approximate similarity search problem. Until now, many kinds of variations of LSH have been proposed for large-scale indexing. Much of the interest is focused on improving the query accuracy for skewed data distribution and reducing the storage space. However, when using LSH, a final filtering process based on exact similarity measure is needed. When the dataset is large-scale, the number of points to be filtered becomes large. As a result, the filtering speed becomes the bottleneck of improving the query speed when the scale of data becomes larger and larger. Furthermore, we observe a “Non-Uniform” phenomenon in the most popular Euclidean LSH which can degrade the filtering speed dramatically. In this paper, a pivot-based algorithm is proposed to improve the filtering speed by using triangle inequality to prune the search process. Furthermore, a novel method to select an optimal pivot for even larger improvement is provided. The experimental results on two open large-scale datasets show that our method can significantly improve the query speed of Euclidean LSH.
► Analyse the “Non-Uniform” problem of Euclidean LSH formally. ► Propose a pivot-based algorithm to accelerate the query process of Euclidean LSH. ► Provide an effective method to get the optimal pivot point. ► Our method can significantly accelerate the query process of Euclidean LSH. ► Verify the feasibility of accelerating our algorithm through sampling.