Visual content representation using semantically similar visual words

Article ID	Journal	Published Year	Pages	File Type
385594	Expert Systems with Applications	2011	10 Pages	PDF

Abstract

Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, has been deployed in the ‘bag-of-visual words’ (BVW) model as an effective method to represent visual content and to enhance its classification and retrieval. This paper makes three key contributions. First, it presents a novel approach to visual word construction that takes the physical spatial information, angle, and scale of keypoints into account, in order to preserve the semantic information of objects in visual content and to enhance the traditional bag-of-visual words model. Second, it gives a method to identify and eliminate similar keypoints, so as to form semantic visual words of high quality and to strengthen the discrimination power for visual content classification. Third, it specifies an approach to discover a set of semantically similar visual words and to form visual phrases that represent visual content more distinctively and help narrow the semantic gap.

► An approach to reduce similar keypoints and preserve the semantic information of visual content.
► A method to discover non-informative visual words, which strengthens classification performance.
► A technique to discover a set of semantically similar visual words to represent visual content more effectively.
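
For readers unfamiliar with the traditional bag-of-visual words model that the abstract builds on, the following is a minimal sketch of that baseline only, not of the paper's method. It assumes the local descriptors have already been extracted (random 128-dimensional vectors stand in for SIFT descriptors here) and uses a plain k-means clustering written with NumPy. The function names `build_vocabulary` and `bvw_histogram`, the vocabulary size, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster descriptors with plain k-means; each centroid is one visual word."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[start].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every descriptor to every centroid.
        d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


def bvw_histogram(descriptors, centroids):
    """Represent one image as a normalised histogram of visual word counts."""
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # nearest visual word for each keypoint
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()


# Synthetic stand-ins for SIFT descriptors (assumption: 128 dimensions).
rng = np.random.default_rng(42)
training_descriptors = rng.random((500, 128))
vocabulary = build_vocabulary(training_descriptors, k=10)

image_descriptors = rng.random((60, 128))
histogram = bvw_histogram(image_descriptors, vocabulary)
```

The resulting `histogram` discards keypoint position, angle, and scale entirely, which is the limitation the paper's first contribution is described as addressing.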

Keywords

Bag-of-visual words