Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
525774 | 869024 | 2012 | 12-page PDF | Free download |

In this paper, we address the incoherence problem of the visual words in bag-of-words vocabularies. Unlike existing work, which assigns words based on closeness in descriptor space, we focus on identifying pairs of independent, distant words (the visual synonyms) that are likely to host image patches of similar visual reality. We focus on landmark images, where the image geometry guides the detection of synonym pairs. Image geometry is used to find those image features that lie at nearly the same physical location, yet are assigned to different words of the visual vocabulary. We evaluate the validity of visual synonyms defined in this way. We also examine the closeness of synonyms in the L2-normalized feature space. We show that visual synonyms may successfully be used for vocabulary reduction. Furthermore, we show that by combining the reduced visual vocabularies with synonym augmentation, we perform on par with the state-of-the-art bag-of-words approach while using a 98% smaller vocabulary.
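The geometric idea in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that two landmark images are related by a known 3x3 homography `H` and that each feature is given as a position plus a visual word id. The function names, the `max_dist` pixel threshold, and the toy data are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import product


def project(H, point):
    """Map a 2D point through a 3x3 homography H given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def synonym_candidates(H, feats_a, feats_b, max_dist=2.0):
    """Count word pairs that coincide geometrically but differ in word id.

    feats_a, feats_b: lists of ((x, y), word_id) for images A and B.
    max_dist: assumed pixel tolerance for "nearly identical location".
    """
    counts = Counter()
    for (pa, wa), (pb, wb) in product(feats_a, feats_b):
        ua, va = project(H, pa)
        dist = ((ua - pb[0]) ** 2 + (va - pb[1]) ** 2) ** 0.5
        # Same physical spot, different visual word: a synonym candidate.
        if dist <= max_dist and wa != wb:
            counts[tuple(sorted((wa, wb)))] += 1
    return counts


# Toy data: image B is image A shifted 10 pixels to the right.
H = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]
feats_a = [((0, 0), 5), ((20, 20), 7)]
feats_b = [((10, 0), 9), ((30, 20), 7), ((100, 100), 5)]
pairs = synonym_candidates(H, feats_a, feats_b)
```

In this toy case only words 5 and 9 are reported as a candidate pair. The feature at (20, 20) also lands on a matching feature, but both carry word 7, so it is not a synonym. Accumulating such counts over many image pairs would give a ranking of candidate synonym pairs.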
► We study visual word incoherence, which is characterized by visually similar patches being hosted in different, non-neighboring visual words.
► We define visual synonyms as linkages between visual words containing visually similar patches, and present an algorithm for extracting them.
► We show that visual synonyms are not necessarily nearest neighbor words, although they may contain visually similar patches.
► We also demonstrate how visual synonyms allow for controllable construction of up to 98% smaller, yet powerful vocabularies.
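One plausible way to turn synonym pairs into a smaller vocabulary is to merge linked words into a single word and re-bin the bag-of-words histograms. The sketch below assumes this merging strategy for illustration only. The listing does not say how the paper performs the reduction, and the function names and toy data are hypothetical.

```python
def merge_synonyms(vocab_size, synonym_pairs):
    """Return a list mapping each old word id to a reduced word id.

    Words linked by synonym pairs (directly or transitively) are merged
    using a simple union-find structure.
    """
    parent = list(range(vocab_size))

    def find(w):
        # Follow parent links to the representative, compressing the path.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in synonym_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Renumber the surviving representatives as 0..k-1.
    roots = sorted({find(w) for w in range(vocab_size)})
    new_id = {r: i for i, r in enumerate(roots)}
    return [new_id[find(w)] for w in range(vocab_size)]


def remap_histogram(hist, mapping):
    """Re-bin a bag-of-words histogram into the reduced vocabulary."""
    reduced = [0] * (max(mapping) + 1)
    for word, count in enumerate(hist):
        reduced[mapping[word]] += count
    return reduced


# Toy vocabulary of 6 words where words 0, 3 and 4 are linked as synonyms.
mapping = merge_synonyms(6, [(0, 3), (3, 4)])
reduced = remap_histogram([2, 1, 0, 5, 1, 4], mapping)
```

Here the six-word vocabulary shrinks to four words, and the counts of words 0, 3 and 4 are pooled into one bin. Choosing how many synonym pairs to merge is what would make the size of the reduced vocabulary controllable.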
Journal: Computer Vision and Image Understanding - Volume 116, Issue 2, February 2012, Pages 238–249