Fusing audio vocabulary with visual features for pornographic video detection

Article code	Journal code	Publication year	English article	Full-text version
425677	685814	2014	8-page PDF	Free download

Keywords

Multimodal fusion

Related topics

Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics

Abstract

Pornographic video detection based on multimodal fusion is an effective approach for filtering pornography. However, existing methods lack an accurate representation of audio semantics and pay little attention to the characteristics of pornographic audio. In this paper, we propose a novel framework that fuses an audio vocabulary with visual features for pornographic video detection. The novelty of our approach lies in three aspects: an audio semantics representation method based on an energy envelope unit (EEU) and bag-of-words (BoW), a periodicity-based audio segmentation algorithm, and a periodicity-based video decision algorithm. The first, named the EEU+BoW representation method, describes audio semantics via an audio vocabulary, which is constructed by k-means clustering of EEUs. The latter two complement each other to make full use of the periodicities in pornographic audio. Using the periodicity-based audio segmentation algorithm, audio streams are divided into EEU sequences. After these EEUs are classified, videos are judged to be pornographic or not by the periodicity-based video decision algorithm. Before fusion, two support vector machines are applied, one for the audio-vocabulary-based method and one for the visual-features-based method. To fuse their results, a keyframe is selected from each EEU according to its beginning and ending positions, and then an integrated weighted scheme and the periodicity-based video decision algorithm are adopted to yield the final detection results. Experimental results show that our approach outperforms the traditional approach based only on visual features and achieves satisfactory performance: the true positive rate reaches 94.44% while the false positive rate is 9.76%.
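
As a rough illustration of the audio vocabulary idea in the abstract, the sketch below clusters feature vectors with plain k-means and then describes a clip as a normalized histogram of "audio words". It is not the paper's implementation: the 2-D vectors standing in for EEU descriptors, the function names, and the evenly spaced centroid initialization (chosen only for reproducibility) are all assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, vector):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], vector))


def build_vocabulary(features, k, iterations=20):
    """Plain Lloyd's k-means; the k centroids play the role of audio words.

    Assumption: initial centroids are taken at evenly spaced positions in
    `features` so the result is deterministic.
    """
    centroids = [list(features[i * len(features) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in features:
            clusters[nearest(centroids, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bow_histogram(centroids, features):
    """Normalized count of how often each audio word occurs in one clip."""
    counts = [0] * len(centroids)
    for v in features:
        counts[nearest(centroids, v)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Hypothetical 2-D descriptors forming two well-separated groups.
training = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
            [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]]
vocabulary = build_vocabulary(training, k=2)

# One clip: one unit near the first word, three near the second.
clip = [[0.1, 0.1], [5.0, 5.0], [5.2, 5.1], [4.8, 4.9]]
histogram = bow_histogram(vocabulary, clip)
```

In a pipeline like the one described, a histogram of this kind would be the input to the audio classifier; how the paper actually extracts EEU descriptors is not stated on this page.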

► We put forward a novel representation method of audio semantics based on EEU and BoW.
► A periodicity-based audio segmentation algorithm is proposed.
► A periodicity-based decision algorithm is presented.
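
The fusion step mentioned in the abstract can be pictured with a small sketch. The page does not give the weighting scheme or the decision rule, so everything here is an assumption: scores are taken to lie in [0, 1], fusion is a simple linear weighted sum, and the video-level rule is a fraction-of-positive-units threshold used as a stand-in for the paper's periodicity-based decision algorithm. The names `fuse_scores`, `flag_video`, `audio_weight`, `score_threshold`, and `min_fraction` are hypothetical.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Linear late fusion of per-EEU audio scores and keyframe visual scores."""
    if len(audio_scores) != len(visual_scores):
        raise ValueError("expected one audio and one visual score per EEU")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return [audio_weight * a + (1.0 - audio_weight) * v
            for a, v in zip(audio_scores, visual_scores)]


def flag_video(fused_scores, score_threshold=0.5, min_fraction=0.5):
    """Flag the video when enough units score at or above the threshold.

    Stand-in rule only: this is NOT the paper's periodicity-based
    video decision algorithm.
    """
    if not fused_scores:
        return False
    positives = sum(1 for s in fused_scores if s >= score_threshold)
    return positives / len(fused_scores) >= min_fraction


# Hypothetical classifier outputs for four EEUs and their keyframes.
audio = [1.0, 0.75, 0.25, 0.5]
visual = [0.5, 0.25, 0.25, 1.0]
fused = fuse_scores(audio, visual, audio_weight=0.5)
decision = flag_video(fused)
```

With equal weights, three of the four fused scores reach the threshold, so this toy video would be flagged. In practice the weight and thresholds would be tuned on validation data.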

Publisher

Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 31, February 2014, Pages 69–76

Authors

Yizhi Liu, Ying Yang, Hongtao Xie, Sheng Tang
