Modeling spatial and semantic cues for large-scale near-duplicated image retrieval

Article code	Journal code	Publication year	English article
527885	869405	2011	12-page PDF

Keywords

Visual vocabulary; Local feature; Distance metric learning

Related topics

Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition

Abstract

The Bag-of-visual-Words (BoW) image representation has been shown to be one of the most promising solutions for large-scale near-duplicated image retrieval. However, the traditional visual vocabulary is created in an unsupervised way by clustering a large number of local image features. This is not ideal because it largely ignores the semantic and spatial contexts between local features. In this paper, we propose the geometric visual vocabulary, which captures the spatial contexts by quantizing local features in bi-space, i.e., in descriptor space and orientation space. We then propose to capture the semantic context by learning a semantic-aware distance metric between local features, which can reasonably measure the semantic similarities between the image patches from which the local features are extracted. The learned distance is then used to cluster the local features for semantic visual vocabulary generation. Finally, we combine the spatial and semantic contexts in a unified framework by extracting local feature groups, computing the spatial configurations between the local features inside each group, and learning a semantic-aware distance between groups. The learned group distance is then used to cluster the extracted local feature groups to generate a novel visual vocabulary, i.e., the contextual visual vocabulary. The proposed visual vocabularies, i.e., the geometric, semantic, and contextual visual vocabularies, are tested in large-scale near-duplicated image retrieval applications. The geometric visual vocabulary and semantic visual vocabulary achieve better performance than the traditional visual vocabulary. Moreover, the contextual visual vocabulary, which combines both spatial and semantic cues, outperforms the state-of-the-art bundled feature in both retrieval precision and efficiency.
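
As a rough illustration of what quantizing a local feature in both descriptor space and orientation space might look like, the following minimal Python sketch assigns each feature to its nearest descriptor cluster center and to a uniform orientation bin, then counts the resulting (word, bin) pairs. The abstract does not specify the quantization scheme, so the uniform binning, the nearest-center assignment with Euclidean distance, and all names here (`quantize_bispace`, `bispace_histogram`, `n_orientation_bins`) are assumptions made for illustration, not the authors' method.

```python
import math


def quantize_bispace(descriptor, orientation, centers, n_orientation_bins=4):
    """Return (descriptor_word, orientation_bin) for one local feature.

    Hypothetical helper: nearest-center assignment in descriptor space
    plus uniform binning of the orientation (radians) are assumptions.
    """
    # Descriptor space: index of the nearest center (squared Euclidean distance).
    word = min(
        range(len(centers)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, centers[k])),
    )
    # Orientation space: wrap the angle to [0, 2*pi) and bin it uniformly.
    angle = orientation % (2 * math.pi)
    obin = int(angle / (2 * math.pi) * n_orientation_bins) % n_orientation_bins
    return word, obin


def bispace_histogram(features, centers, n_orientation_bins=4):
    """Count (word, orientation bin) pairs over (descriptor, orientation) features."""
    hist = [0] * (len(centers) * n_orientation_bins)
    for descriptor, orientation in features:
        word, obin = quantize_bispace(
            descriptor, orientation, centers, n_orientation_bins
        )
        hist[word * n_orientation_bins + obin] += 1
    return hist


# Toy data: two 2-D "descriptor" centers and three features.
centers = [[0.0, 0.0], [1.0, 1.0]]
features = [([0.1, 0.0], 0.1), ([0.9, 1.0], 3.5), ([0.0, 0.2], 4.8)]
histogram = bispace_histogram(features, centers)
```

Under these assumptions, two features with similar descriptors but different orientations fall into different histogram bins, which is one simple way for a vocabulary to retain some geometric information that a plain descriptor-only BoW histogram discards.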

Research highlights
► The proposed Geometric Visual Vocabulary captures the spatial contexts.
► The Semantic Visual Vocabulary captures the semantic contexts.
► We further propose the Contextual Visual Vocabulary which captures both the semantic and spatial contextual information.
► The semantic and geometric visual vocabularies perform better than the traditional visual vocabulary.
► The contextual visual vocabulary is shown to be more descriptive and outperforms the state-of-the-art algorithm.

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Vision and Image Understanding - Volume 115, Issue 3, March 2011, Pages 403–414

Authors

Shiliang Zhang, Qi Tian, Gang Hua, Wengang Zhou, Qingming Huang, Houqiang Li, Wen Gao
