Article ID: 6940287
Journal: Pattern Recognition Letters
Published Year: 2018
Pages: 9 Pages
File Type: PDF
Abstract
Most recent partial-duplicate image retrieval approaches build on the bag-of-visual-words (BOW) model, in which local image features are quantized into a bag of compact visual words, i.e., the BOW representation, for fast image matching. However, due to quantization errors in the visual words, the BOW representation has low discriminability, which degrades retrieval accuracy. Encoding contextual clues into the BOW representation is a popular technique for improving its discriminability. Unfortunately, the captured contextual clues are generally neither stable nor informative enough, yielding only limited improvement. To address these issues, we propose a multiple contextual clue encoding approach for partial-duplicate image retrieval. Treating each visual word of a query or database image as a center, we first propose an asymmetric context selection strategy that selects contextual visual words differently for query and database images. We then capture multiple contextual clues: the geometric relationships, visual relationships, and spatial configurations between the center and its contextual visual words. These clues are compressed into multi-contextual descriptors, which are integrated with the center visual word to improve the discriminability of the BOW representation. Experiments on a large-scale partial-duplicate image dataset demonstrate that the proposed approach achieves higher retrieval accuracy than state-of-the-art methods while remaining comparable in time and space efficiency.
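The pipeline the abstract outlines — quantizing local features to visual words, selecting contextual words around each center, and encoding clues between them — can be illustrated with a short Python sketch. Everything below (the function names, the nearest-neighbor context rule, and the specific asymmetric choice of different context sizes for query and database images) is an illustrative assumption, not the paper's actual method:

```python
# A minimal, illustrative sketch of the pipeline described above:
# quantize local descriptors to visual words (BOW), select contextual
# words around a center, and encode simple geometric/visual clues.
# The nearest-neighbor context rule and the asymmetric context sizes
# are assumptions for illustration, not the paper's method.
import numpy as np

def quantize(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word."""
    # Pairwise squared distances: |d|^2 - 2 d.c + |c|^2
    d2 = ((descriptors ** 2).sum(1, keepdims=True)
          - 2.0 * descriptors @ codebook.T
          + (codebook ** 2).sum(1))
    return d2.argmin(axis=1)                 # one visual word id per feature

def select_context(keypoints, center_idx, k):
    """Pick the k spatially nearest keypoints as contextual words."""
    dists = np.linalg.norm(keypoints - keypoints[center_idx], axis=1)
    dists[center_idx] = np.inf               # exclude the center itself
    return np.argsort(dists)[:k]

def encode_clues(keypoints, words, center_idx, ctx_idx, n_angle_bins=16):
    """Toy geometric + visual clues between the center and its context."""
    v = keypoints[ctx_idx] - keypoints[center_idx]
    angle = np.arctan2(v[:, 1], v[:, 0])     # geometric clue: direction
    angle_bin = (((angle + np.pi) / (2 * np.pi))
                 * n_angle_bins).astype(int) % n_angle_bins
    return list(zip(words[ctx_idx].tolist(), angle_bin.tolist()))

rng = np.random.default_rng(0)
codebook = rng.normal(size=(1000, 128))      # toy visual vocabulary
desc = rng.normal(size=(50, 128))            # toy local descriptors
kp = rng.uniform(0, 640, size=(50, 2))       # toy keypoint positions

words = quantize(desc, codebook)
# Asymmetric selection: assume a larger context for database images
# than for the query (one possible reading of the strategy).
k_query, k_db = 4, 8
ctx = select_context(kp, center_idx=0, k=k_db)
print("center word:", words[0])
print("context clues:", encode_clues(kp, words, 0, ctx))
```

The abstract's next step — compressing such clues into compact multi-contextual descriptors attached to the center visual word — would operate on pairs like the (word id, angle bin) tuples this sketch produces.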
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition