Article code: 527943
Journal code: 869431
Publication year: 2009
English article: 10 pages, PDF
Full-text version: Free download
English title of the ISI article
Visual word proximity and linguistics for semantic video indexing and near-duplicate retrieval
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract

Bag-of-visual-words (BoW) has recently become a popular representation to describe video and image content. Most existing approaches, nevertheless, neglect inter-word relatedness and measure similarity by bin-to-bin comparison of visual words in histograms. In this paper, we explore the linguistic and ontological aspects of visual words for video analysis. Two approaches, soft-weighting and constraint-based earth mover’s distance (CEMD), are proposed to model different aspects of visual word linguistics and proximity. In soft-weighting, visual words are cleverly weighted such that the linguistic meaning of words is taken into account for bin-to-bin histogram comparison. In CEMD, a cross-bin matching algorithm is formulated such that the ground distance measure considers the linguistic similarity of words. In particular, a BoW ontology which hierarchically specifies the hyponym relationship of words is constructed to assist the reasoning. We demonstrate soft-weighting and CEMD on two tasks: video semantic indexing and near-duplicate keyframe retrieval. Experimental results indicate that soft-weighting is superior to other popular weighting schemes, such as term frequency (TF) weighting, on a large-scale video database. In addition, CEMD shows excellent performance compared to cosine similarity in near-duplicate retrieval.
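
The limitation the abstract attributes to bin-to-bin comparison can be illustrated with a small sketch. This is not the paper's soft-weighting or CEMD: the four-word vocabulary, the `RELATEDNESS` matrix, and the `smooth_histogram` helper are all hypothetical, chosen only to show how a TF histogram compared by cosine similarity ignores related words, and how spreading weight onto related words changes the score.

```python
import math


def tf_histogram(word_ids, vocab_size):
    """Term-frequency histogram: count how often each visual word occurs."""
    hist = [0.0] * vocab_size
    for w in word_ids:
        hist[w] += 1.0
    return hist


def cosine_similarity(h1, h2):
    """Bin-to-bin comparison: only identical word indices contribute to the dot product."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)


def smooth_histogram(hist, relatedness):
    """Spread each bin's mass onto related words (hypothetical helper, not the paper's scheme)."""
    size = len(hist)
    return [sum(hist[j] * relatedness[j][i] for j in range(size)) for i in range(size)]


# Hypothetical relatedness between 4 visual words; words 0 and 1 are assumed to be close.
RELATEDNESS = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Two keyframes that differ only in which of the two related words they contain.
frame_a = tf_histogram([0, 0, 2], vocab_size=4)
frame_b = tf_histogram([1, 1, 2], vocab_size=4)

plain = cosine_similarity(frame_a, frame_b)
smoothed = cosine_similarity(
    smooth_histogram(frame_a, RELATEDNESS),
    smooth_histogram(frame_b, RELATEDNESS),
)
print(plain, smoothed)
```

With plain TF histograms the two frames share only word 2, so their similarity is low; once weight is shared between the related words 0 and 1, the score rises. The weighting and distance actually proposed in the paper are defined in its full text and may differ substantially from this sketch.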

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Vision and Image Understanding - Volume 113, Issue 3, March 2009, Pages 405–414
Authors
, ,