کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
425677 685814 2014 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Fusing audio vocabulary with visual features for pornographic video detection
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Fusing audio vocabulary with visual features for pornographic video detection
چکیده انگلیسی

Pornographic video detection based on multimodal fusion is an effective approach for filtering pornography. However, existing methods lack accurate representation of audio semantics and pay little attention to the characteristics of pornographic audios. In this paper, we propose a novel framework of fusing audio vocabulary with visual features for pornographic video detection. The novelty of our approach lies in three aspects: an audio semantics representation method based on an energy envelope unit (EEU) and bag-of-words (BoW), a periodicity-based audio segmentation algorithm, and a periodicity-based video decision algorithm. The first one, named the EEU+BoW representation method, is proposed to describe the audio semantics via an audio vocabulary. The audio vocabulary is constructed by k-means clustering of EEUs. The latter two aspects echo with each other to make full use of the periodicities in pornographic audios. Using the periodicity-based audio segmentation algorithm, audio streams are divided into EEU sequences. After these EEUs are classified, videos are judged to be pornographic or not by the periodicity-based video decision algorithm. Before fusion, two support vector machines are respectively applied for the audio-vocabulary-based and visual-features-based methods. To fuse their results, a keyframe is selected from each EEU in terms of the beginning and ending positions, and then an integrated weighted scheme and a periodicity-based video decision algorithm are adopted to yield final detection results. Experimental results show that our approach outperforms the traditional one which is only based on visual features, and achieves satisfactory performance. The true positive rate achieves 94.44% while the false positive rate is 9.76%.


► We put forward a novel representation method of audio semantics based on EEU and BoW.
► A periodicity-based audio segmentation algorithm is proposed.
► A periodicity-based decision algorithm is presented.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Future Generation Computer Systems - Volume 31, February 2014, Pages 69–76
نویسندگان
, , , ,