Article code | Journal code | Publication year | English article | Full-text version
558300 | 874892 | 2014 | 17 pages, PDF | Free download
English title of the ISI article
Shape-based modeling of the fundamental frequency contour for emotion detection in speech
Persian translation of the title
مدل سازی شکل مبتنی بر کانون فرکانس اساسی برای شناسایی احساسات در گفتار
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• We propose neutral reference models to detect emotions in the fundamental frequency.
• A novel scheme based on functional data analysis is presented.
• The proposed approach achieves higher accuracies than the state-of-the-art method.
• The scheme is also employed to detect the most emotionally salient segments.
• The approach is validated at the sub-sentence level by employing a natural database.

This paper proposes the use of neutral reference models to detect local emotional prominence in the fundamental frequency. A novel approach based on functional data analysis (FDA) is presented, which aims to capture the intrinsic variability of F0 contours. The neutral models are represented by a basis of functions and the testing F0 contour is characterized by the projections onto that basis. For a given F0 contour, we estimate the functional principal component analysis (PCA) projections, which are used as features for emotion detection. The approach is evaluated with lexicon-dependent (i.e., one functional PCA basis per sentence) and lexicon-independent (i.e., a single functional PCA basis across sentences) models. The experimental results show that the proposed system can lead to accuracies as high as 75.8% in binary emotion classification, which is 6.2% higher than the accuracy achieved by a benchmark system trained with global F0 statistics. The approach can be implemented at sub-sentence level (e.g., 0.5 s segments), facilitating the detection of localized emotional information conveyed within the sentence. The approach is validated with the SEMAINE database, which is a spontaneous corpus. The results indicate that the proposed scheme can be effectively employed in real applications to detect emotional speech.
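The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that ordinary PCA on contours resampled to a fixed-length grid is an acceptable stand-in for functional PCA (no basis-function smoothing is applied), and the function names, the number of components, the grid size, and the synthetic contours are all hypothetical choices made for illustration.

```python
import numpy as np

def resample_contour(f0, n_points=50):
    """Linearly interpolate an F0 contour onto a fixed-length, normalized time grid."""
    f0 = np.asarray(f0, dtype=float)
    src = np.linspace(0.0, 1.0, num=len(f0))
    dst = np.linspace(0.0, 1.0, num=n_points)
    return np.interp(dst, src, f0)

def fit_neutral_basis(neutral_contours, n_components=3, n_points=50):
    """Estimate a mean contour and principal directions from neutral speech only."""
    X = np.vstack([resample_contour(c, n_points) for c in neutral_contours])
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered neutral contours.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(f0, mean, basis):
    """Project a test contour onto the neutral basis; the scores serve as features."""
    x = resample_contour(f0, basis.shape[1])
    return basis @ (x - mean)

# Synthetic data (assumption): neutral contours are a gentle arch around 120 Hz.
rng = np.random.default_rng(0)
neutral = [
    120 + 10 * np.sin(np.linspace(0, np.pi, n)) + rng.normal(0, 2, n)
    for n in rng.integers(40, 80, size=20)
]
mean, basis = fit_neutral_basis(neutral)

# A test contour with a much wider pitch excursion than the neutral reference.
test_contour = 150 + 40 * np.sin(np.linspace(0, np.pi, 60))
features = project(test_contour, mean, basis)
print(features)
```

In a setup of this kind, the projection scores would be passed to a binary classifier (neutral versus emotional). Fitting one basis per sentence would correspond to the lexicon-dependent condition, and fitting a single basis across all sentences to the lexicon-independent one. For sub-sentence analysis, the same functions could be applied to short windows of the contour, such as the 0.5 s segments mentioned in the abstract.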

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 28, Issue 1, January 2014, Pages 278–294