Article code: 567460
Journal code: 876080
Publication year: 2012
English article: 14 pages, PDF
Full text: free download
English title of the ISI article
Classification of emotional speech using 3DEC hierarchical classifier
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract

The recognition of emotion from speech acoustics is an important problem in human–machine interaction, with many potential applications. In this paper, we first compare four ways to extend binary support vector machines (SVMs) to multiclass classification for recognising emotions from speech—namely two standard SVM schemes (one-versus-one and one-versus-rest) and two other methods (DAG and UDT) that form a hierarchy of classifiers, each making a distinct binary decision about class membership. These are trained and tested using 6552 features per speech sample extracted from three databases of acted emotional speech (DES, Berlin and Serbian) and a database of spontaneous speech (FAU Aibo Emotion Corpus) using the OpenEAR toolkit. Analysis of the errors made by these classifiers leads us to apply non-metric multi-dimensional scaling (NMDS) to produce a compact (two-dimensional) representation of the data suitable for guiding the choice of decision hierarchy. This representation can be interpreted in terms of the well-known valence-arousal model of emotion. We find that this model does not give a particularly good fit to the data: although the arousal dimension can be identified easily, valence is not well represented in the transformed data. We describe a new hierarchical classification technique whose structure is based on NMDS, which we call Data-Driven Dimensional Emotion Classification (3DEC). This new method is compared with the best of the four classifiers studied earlier and a state-of-the-art classification method on all four databases. We find no significant difference between these three approaches with respect to speaker-dependent performance. However, for the much more interesting and important case of speaker-independent emotion classification, 3DEC significantly outperforms the competitors.


► Compare 4 extensions of binary SVMs to multiclass classification of speech emotions.
► Use large number of acoustic features (6552) and four speech databases.
► Apply multidimensional scaling (NMDS) to the confusions as a test of valence-arousal model.
► Describe a new hierarchical classifier (3DEC) whose structure is based on NMDS.
► 3DEC significantly outperforms the competitors on speaker-independent emotion recognition.
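
The abstract does not say how the classifier confusions are converted into the dissimilarities that NMDS needs, so the following is only a minimal sketch under stated assumptions: confusion counts are row-normalised, the two directions of confusion are averaged, and dissimilarity is taken as one minus that average. The function name, the four emotion labels, and the counts are hypothetical illustrations, not data or code from the paper.

```python
def confusion_to_dissimilarity(confusion):
    """Turn a square confusion-count matrix into a symmetric dissimilarity matrix.

    Assumption (not from the paper): classes that are often confused with
    each other are treated as "close", so dissimilarity = 1 - mean confusion rate.
    """
    n = len(confusion)
    # Row-normalise so each row holds the rate of each predicted class per true class.
    rates = []
    for row in confusion:
        total = sum(row)
        rates.append([count / total if total else 0.0 for count in row])

    dissim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Average both directions so the result is symmetric.
            similarity = 0.5 * (rates[i][j] + rates[j][i])
            dissim[i][j] = dissim[j][i] = 1.0 - similarity
    return dissim


# Hypothetical labels and made-up counts (rows: true class, columns: predicted class).
labels = ["anger", "happiness", "sadness", "neutral"]
confusion = [
    [70, 25, 1, 4],
    [30, 60, 2, 8],
    [1, 2, 80, 17],
    [5, 8, 20, 67],
]

dissim = confusion_to_dissimilarity(confusion)

# Find the pair of classes with the smallest dissimilarity.
pairs = [(dissim[i][j], labels[i], labels[j])
         for i in range(len(labels)) for j in range(i + 1, len(labels))]
closest = min(pairs)
print(closest[1], closest[2], round(closest[0], 3))
```

A matrix like `dissim` could then be passed to any non-metric MDS implementation that accepts precomputed dissimilarities, and the resulting two-dimensional layout inspected to decide which groups of classes a hierarchy should separate first.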

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 7, September 2012, Pages 903–916