Article code: 515557
Journal code: 867045
Publication year: 2013
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
Rank hash similarity for fast similarity search
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science Software
English abstract

This paper is concerned with large-scale similarity search, which efficiently and effectively finds data points similar to a query data point. An efficient way to accelerate similarity search is to learn hash functions. Existing approaches for learning hash functions aim to obtain low Hamming distances for similar pairs. However, these methods ignore the ranking order of these Hamming distances, which leads to poor accuracy in finding similar items for a query data point. In this paper, an algorithm referred to as top-k RHS (Rank Hash Similarity) is proposed, in which a ranking loss function is designed for learning a hash function. The hash function is assumed to be made up of l binary classifiers, so learning a hash function can be formulated as the task of learning l binary classifiers. The algorithm runs for l rounds and learns one binary classifier in each round. The proposed method has the same order of computational complexity as existing approaches. Nevertheless, experimental results on three text datasets show that it obtains higher accuracy than the baselines.


► Ranking Hamming distances can boost retrieval accuracy for fast similarity search.
► To this end, a ranking loss function is designed to learn hash functions.
► Compared with the baselines, the method has the same computational complexity.
► Experimental results show the advantage of the method in retrieving similar items.
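
As a rough illustration of the setting described in the abstract, the sketch below builds a hash function out of l binary classifiers (one bit each) and ranks database items by Hamming distance to a query. It is not the top-k RHS learning procedure: the linear classifiers here use random placeholder weights instead of weights learned with the ranking loss, and the names `make_hash_function`, `hamming_distance`, and `top_k` are hypothetical, chosen only for this example.

```python
import random


def make_hash_function(dim, num_bits, seed=0):
    """Build a hash function from num_bits linear binary classifiers.

    Assumption: the weights are random placeholders. In the paper they
    would be learned, one classifier per round, with a ranking loss.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(num_bits)]

    def hash_fn(x):
        # Each classifier contributes one bit: 1 if w . x >= 0, else 0.
        return tuple(
            1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
            for w in weights
        )

    return hash_fn


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two equal-length codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def top_k(query_code, database_codes, k):
    """Return indices of the k codes nearest to the query in Hamming distance.

    Ties keep their original database order because sorted() is stable.
    """
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return order[:k]
```

In use, every database item would be hashed once up front, and a query would be hashed and compared against the stored codes with `top_k`. The order in which `top_k` returns items is exactly the ranking of Hamming distances that the abstract says existing methods ignore.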

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 49, Issue 1, January 2013, Pages 158–168