Bimodal fusion of low-level visual features and high-level semantic features for near-duplicate video clip detection

Article code	Journal code	Publication year	English article	Full-text version
537027	870672	2011	16-page PDF	Free download

Keywords

Video copy detection; Semantic features

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition

Abstract

The detection of near-duplicate video clips (NDVCs) is an area of current research interest and intense development. Most NDVC detection methods represent video clips with a unique set of low-level visual features, typically describing color or texture information. However, low-level visual features are sensitive to transformations of the video content. Given the observation that transformations tend to preserve the semantic information conveyed by the video content, we propose a novel approach for identifying NDVCs, making use of both low-level visual features (that is, MPEG-7 visual features) and high-level semantic features (that is, 32 semantic concepts detected using trained classifiers). Experimental results obtained for the publicly available MUSCLE-VCD-2007 and TRECVID 2008 video sets show that bimodal fusion of visual and semantic features facilitates robust NDVC detection. In particular, the proposed method is able to identify NDVCs with a low missed detection rate (3% on average) and a low false alarm rate (2% on average). In addition, the combined use of visual and semantic features outperforms the separate use of either of them in terms of NDVC detection effectiveness. Further, we demonstrate that the effectiveness of the proposed method is on par with or better than the effectiveness of three state-of-the-art NDVC detection methods, which make use of temporal ordinal measurement, features computed using the Scale-Invariant Feature Transform (SIFT), or bag-of-visual-words (BoVW). We also show that the effectiveness of semantic concept detection has only a limited influence on the effectiveness of NDVC detection, as long as the mean average precision (MAP) of the semantic concept detectors used is higher than 0.3. Finally, we illustrate that the computational complexity of our NDVC detection method is competitive with that of the three aforementioned NDVC detection methods.

▸ This paper proposes a novel method for detecting near-duplicate video clips (NDVCs).
▸ The proposed method fuses MPEG-7 visual features and semantic features.
▸ The semantic features are extracted by means of 32 trained classifiers.
▸ The proposed method facilitates robust NDVC detection.
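
The abstract does not state how the two modalities are combined, so the following is only an illustrative sketch of one common option, score-level fusion, and not the authors' method. It assumes each clip is described by a visual feature vector and a semantic vector (for example, 32 concept-detector confidence scores), compares each modality with cosine similarity, and blends the two scores with a weighted sum. The function names, the cosine measure, the weight, and the threshold are all hypothetical choices made for illustration. The last helper shows how a missed detection rate and a false alarm rate, the two measures reported in the abstract, are conventionally computed from binary decisions.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def fused_similarity(visual_a, visual_b, semantic_a, semantic_b, weight=0.5):
    """Hypothetical fusion rule: weighted sum of the per-modality similarities.

    `weight` is the share given to the visual score (an assumed parameter).
    """
    s_visual = cosine_similarity(visual_a, visual_b)
    s_semantic = cosine_similarity(semantic_a, semantic_b)
    return weight * s_visual + (1.0 - weight) * s_semantic


def is_near_duplicate(visual_a, visual_b, semantic_a, semantic_b,
                      weight=0.5, threshold=0.8):
    """Declare a near-duplicate when the fused score reaches an assumed threshold."""
    score = fused_similarity(visual_a, visual_b, semantic_a, semantic_b, weight)
    return score >= threshold


def error_rates(decisions, labels):
    """Return (missed detection rate, false alarm rate) for binary decisions.

    decisions[i] is the detector output and labels[i] the ground truth
    (True means "near-duplicate").
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    missed = sum(1 for d, y in zip(decisions, labels) if y and not d)
    false_alarms = sum(1 for d, y in zip(decisions, labels) if d and not y)
    missed_rate = missed / positives if positives else 0.0
    false_alarm_rate = false_alarms / negatives if negatives else 0.0
    return missed_rate, false_alarm_rate
```

In this sketch, a transformation that distorts the visual vector lowers only the visual term, so a well-preserved semantic vector can still keep the fused score above the threshold. That is the intuition the abstract gives for combining the two feature types.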

Publisher

Database: Elsevier - ScienceDirect
Journal: Signal Processing: Image Communication - Volume 26, Issue 10, November 2011, Pages 612–627

Authors

Hyun-seok Min, Jae Young Choi, Wesley De Neve, Yong Man Ro
