Article ID: 526903
Journal: Image and Vision Computing
Published Year: 2013
Pages: 11
File Type: PDF
Abstract

The explosion of user-generated, untagged multimedia data in recent years generates a strong need for efficient search and retrieval of this data. The predominant method for content-based tagging is slow, labor-intensive manual annotation. Consequently, automatic tagging is currently a subject of intensive research. However, it is clear that the process will not be fully automated in the foreseeable future. We propose to involve the user and investigate methods for implicit tagging, wherein users' responses to the interaction with the multimedia content are analyzed in order to generate descriptive tags. Here, we present a multi-modal approach that analyzes both facial expressions and electroencephalography (EEG) signals for the generation of affective tags. We perform classification and regression in the valence-arousal space and present results for both feature-level and decision-level fusion. We demonstrate improvement in the results when using both modalities, suggesting that the modalities contain complementary information.
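The abstract contrasts feature-level and decision-level fusion of the two modalities. As a minimal sketch of what these two strategies mean in general, the following Python example uses scikit-learn with entirely hypothetical EEG and facial-expression feature arrays and a binary valence label; the feature dimensions, classifier choice, and combination rule are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch of feature-level vs. decision-level fusion (illustrative only).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials = 200                                  # hypothetical number of video-watching trials
eeg_feats = rng.normal(size=(n_trials, 32))     # hypothetical EEG features (e.g. band power)
face_feats = rng.normal(size=(n_trials, 20))    # hypothetical facial-expression features
valence = rng.integers(0, 2, size=n_trials)     # hypothetical binary valence labels (low/high)

# Feature-level fusion: concatenate the modality features and train a single classifier.
X_fused = np.hstack([eeg_feats, face_feats])
clf_feature_level = SVC(probability=True).fit(X_fused, valence)

# Decision-level fusion: train one classifier per modality,
# then combine their posterior probabilities (here, a simple average).
clf_eeg = SVC(probability=True).fit(eeg_feats, valence)
clf_face = SVC(probability=True).fit(face_feats, valence)
p_fused = (clf_eeg.predict_proba(eeg_feats)[:, 1]
           + clf_face.predict_proba(face_feats)[:, 1]) / 2
decision_level_pred = (p_fused > 0.5).astype(int)
```

In practice one would evaluate both strategies on held-out trials; the abstract reports that combining the modalities improves over either alone, which is consistent with the two modalities carrying complementary information.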

► In this work, we fuse the modalities of EEG and facial expressions for implicit affective tagging.
► We use the recordings of 24 participants, each watching 20 videos designed to elicit emotions.
► We perform classification of the recorded EEG signals and facial expressions into arousal/valence/control dimensions.
► We investigate methods for feature-level and decision-level fusion and demonstrate increased performance.
► We demonstrate that aggregating the generated affect estimates over many subjects produces reliable affect tags.

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition