Efficient and effective strategies for cross-corpus acoustic emotion recognition

Article ID	Journal	Published Year	Pages	File Type
6864980	Neurocomputing	2018	31 Pages	PDF

Abstract

An important research direction in speech technology is robust cross-corpus and cross-language emotion recognition. In this paper, we propose computationally efficient and performance effective feature normalization strategies for the challenging task of cross-corpus acoustic emotion recognition. We particularly deploy a cascaded normalization approach, combining linear speaker level, nonlinear value level and feature vector level normalization to minimize speaker- and corpus-related effects as well as to maximize class separability with linear kernel classifiers. We use extreme learning machine classifiers on five corpora representing five languages from different families, namely Danish, English, German, Russian and Turkish. Using a standard set of suprasegmental features, the proposed normalization strategies show superior performance compared to benchmark normalization approaches commonly used in the literature.

Keywords

Extreme learning machines