Free article download: Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data

Article code	Journal code	Publication year	English article	Full-text version
6941072	870212	2015	9-page PDF	Free download

English title of the ISI article

Prediction of asynchronous dimensional emotion ratings from audiovisual and physiological data

Keywords

Context-learning long short-term memory recurrent neural networks; audiovisual data; continuous affect analysis; multi-task learning; multi-resolution feature extraction; multimodal fusion

Related topics

Engineering and basic sciences, computer engineering, computer vision and pattern recognition

Abstract

Automatic emotion recognition systems based on supervised machine learning require reliable annotation of affective behaviours to build useful models. Whereas the dimensional approach is getting more and more popular for rating affective behaviours in continuous time domains, e.g., arousal and valence, methodologies to take into account reaction lags of the human raters are still rare. We therefore investigate the relevance of using machine learning algorithms able to integrate contextual information in the modelling, like long short-term memory recurrent neural networks do, to automatically predict emotion from several (asynchronous) raters in continuous time domains, i.e., arousal and valence. Evaluations are performed on the recently proposed RECOLA multimodal database (27 subjects, 5 min of data and six raters for each), which includes audio, video, and physiological (ECG, EDA) data. In fact, studies uniting audiovisual and physiological information are still very rare. Features are extracted with various window sizes for each modality and performance for the automatic emotion prediction is compared for both different architectures of neural networks and fusion approaches (feature-level/decision-level). The results show that: (i) LSTM network can deal with (asynchronous) dependencies found between continuous ratings of emotion with video data, (ii) the prediction of the emotional valence requires longer analysis window than for arousal and (iii) a decision-level fusion leads to better performance than a feature-level fusion. The best performance (concordance correlation coefficient) for the multimodal emotion prediction is 0.804 for arousal and 0.528 for valence.
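
The abstract reports performance as a concordance correlation coefficient (CCC) but does not spell out the formula. As a minimal sketch, assuming the standard definition by Lin, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), the metric could be computed as below. The function name `concordance_cc` and the toy arrays are illustrative assumptions, and the sketch does not claim to reproduce how the authors aggregated scores across recordings.

```python
import numpy as np


def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two 1-D sequences.

    Assumes Lin's standard definition with population (biased) variance
    and covariance. Hypothetical helper, not taken from the paper.
    Undefined (division by zero) if both inputs are the same constant.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mean_p, mean_g = pred.mean(), gold.mean()
    var_p, var_g = pred.var(), gold.var()  # ddof=0: population variance
    cov = np.mean((pred - mean_p) * (gold - mean_g))
    # The squared mean difference penalises bias, unlike Pearson's r.
    return 2.0 * cov / (var_p + var_g + (mean_p - mean_g) ** 2)


# Toy example: a prediction perfectly correlated with the gold rating
# but shifted by a constant offset scores below 1.
gold = [1.0, 2.0, 3.0]
print(concordance_cc(gold, gold))             # identical traces
print(concordance_cc([2.0, 3.0, 4.0], gold))  # same shape, offset by 1
```

Unlike Pearson correlation, this measure drops when predictions are offset or scaled relative to the gold rating, which is presumably why it suits continuous arousal and valence traces.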

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 66, 15 November 2015, Pages 22-30

Authors

Fabien Ringeval, Florian Eyben, Eleni Kroupi, Anil Yuce, Jean-Philippe Thiran, Touradj Ebrahimi, Denis Lalanne, Björn Schuller
