Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
385415 | 660865 | 2011 | 15-page PDF | Free download |

This paper presents our research on detecting violent scenes in movies using advanced fusion methodologies based on learning, knowledge representation and reasoning. We follow a multi-step approach: first, automated audio and visual analysis extracts audio and visual cues. Then, two fusion approaches are deployed: (i) a multimodal one that uses machine learning techniques to make a binary decision on whether violence is present, and (ii) an ontological, reasoning-based one that combines the audio-visual cues with violence and multimedia ontologies. The latter infers not only whether a video scene contains violence but also the type of violence (fight, screams, gunshots). Both approaches are experimentally tested, validated and compared on the binary violence detection problem. Finally, violence type identification results are presented for the ontological fusion approach. For evaluation purposes, a large dataset of real movie data has been compiled.
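To make the second fusion approach more concrete, the following is a minimal sketch of how rules over audio-visual cues could map a video segment to violence types. It is not the paper's ontology or reasoner: the cue names (`gunshot_audio`, `scream_audio`, `high_motion`, `people_present`), the threshold of 0.5 and the rules themselves are illustrative assumptions; only the three type labels (fight, screams, gunshots) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class SegmentCues:
    """Hypothetical per-segment cue confidences in [0, 1] from upstream analyzers."""
    gunshot_audio: float = 0.0
    scream_audio: float = 0.0
    high_motion: float = 0.0
    people_present: float = 0.0


def infer_violence_types(cues: SegmentCues, threshold: float = 0.5) -> set:
    """Apply simple hand-written rules (assumed, not from the paper) to the cues."""
    types = set()
    if cues.gunshot_audio >= threshold:
        types.add("gunshots")
    if cues.scream_audio >= threshold:
        types.add("screams")
    # A fight is assumed here to need both strong motion and people in the frame.
    if cues.high_motion >= threshold and cues.people_present >= threshold:
        types.add("fight")
    return types


def is_violent(cues: SegmentCues, threshold: float = 0.5) -> bool:
    """Binary decision: the segment is violent if any violence type is inferred."""
    return bool(infer_violence_types(cues, threshold))
```

In this sketch the binary decision falls out of the type inference, which mirrors the abstract's point that the reasoning-based approach yields both the presence and the type of violence.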
► We extract discriminative audio and visual features specific to violence detection.
► We employ a meta-classification scheme for fusing audio and visual modalities.
► We introduce a 5-step cross-modality ontological/inferencing framework for violence identification.
► We present a first attempt at a complete ontological definition of the movie violence domain.
► Experiments demonstrate the added value of the ontological approach for higher-level semantics extraction and representation.
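The meta-classification idea in the highlights can be illustrated with a small late-fusion sketch: each modality produces its own violence score, and a second-stage classifier learns how to combine them into a binary decision. The logistic-regression meta-classifier, the function names and the toy scores below are assumptions chosen for illustration; the abstract does not state which learner or features the authors used.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_classifier(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on (audio_score, visual_score) pairs by batch gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (audio, visual), y in zip(scores, labels):
            err = sigmoid(w[0] * audio + w[1] * visual + b) - y
            grad_w[0] += err * audio
            grad_w[1] += err * visual
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict_violence(w, b, audio: float, visual: float) -> bool:
    """Fused binary decision from the two per-modality scores."""
    return sigmoid(w[0] * audio + w[1] * visual + b) >= 0.5


# Toy, made-up per-modality scores in [0, 1]; label 1 means violent.
toy_scores = [(0.9, 0.8), (0.8, 0.3), (0.2, 0.9), (0.7, 0.9), (0.1, 0.2), (0.3, 0.1)]
toy_labels = [1, 1, 1, 1, 0, 0]
weights, bias = train_meta_classifier(toy_scores, toy_labels)
```

Fusing at the score level keeps the audio and visual classifiers independent, so either can be retrained or replaced without changing the other.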
Journal: Expert Systems with Applications - Volume 38, Issue 11, October 2011, Pages 14102–14116