Article ID: 6951366
Journal: Biomedical Signal Processing and Control
Published Year: 2015
Pages: 11
File Type: PDF
Abstract
Features strongly influence the results of speech emotion recognition, and Mel-frequency cepstral coefficients (MFCC) are the most commonly used among them. However, MFCC considers neither the relationship among neighboring Mel filter coefficients within a frame nor the relationship among Mel filter coefficients of neighboring frames, which may discard many useful features of the spectrogram. This paper presents novel weighted spectral features based on local Hu moments. The idea is motivated by the observation that the energy on the spectrogram varies drastically for some emotion types, such as anger and happiness, while it changes only slightly for others, such as sadness and fear. This phenomenon affects the local energy distribution of the spectrogram along both the time axis and the frequency axis. To describe this local energy distribution, Hu moments computed from local regions of the spectrogram are used, since they measure the degree to which energy is concentrated around the center of gravity of the energy in a local region and can vary significantly with the speech emotion type. The experiments conducted validate the effectiveness of the proposed features for speech emotion recognition.
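As a rough illustration of what computing Hu moments over local regions of a spectrogram could look like, the sketch below tiles a spectrogram (given as a list of rows of non-negative energies) into blocks and computes the first two Hu invariants per block. This is not the authors' method: the abstract does not give the block size, the tiling scheme, the weighting, or which invariants are used, so the non-overlapping tiling, the restriction to the first two invariants, and the names `raw_moment`, `hu_first_two`, and `local_hu_features` are all assumptions made for illustration.

```python
def raw_moment(block, p, q):
    """Raw moment M_pq of a 2D block (rows = frequency bins, columns = frames)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(block)
               for x, v in enumerate(row))


def hu_first_two(block):
    """Return the first two Hu invariants (phi1, phi2) of a non-negative block."""
    m00 = raw_moment(block, 0, 0)
    if m00 == 0:
        # Assumption: a block with no energy gets zero-valued moments.
        return 0.0, 0.0
    # Center of gravity of the energy inside the block.
    xc = raw_moment(block, 1, 0) / m00
    yc = raw_moment(block, 0, 1) / m00

    def eta(p, q):
        # Central moment mu_pq, normalized for scale invariance.
        mu = sum(((x - xc) ** p) * ((y - yc) ** q) * v
                 for y, row in enumerate(block)
                 for x, v in enumerate(row))
        return mu / m00 ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


def local_hu_features(spec, block_h, block_w):
    """Tile the spectrogram into non-overlapping blocks (an assumed scheme)
    and compute (phi1, phi2) for each complete block."""
    feats = []
    for r in range(0, len(spec) - block_h + 1, block_h):
        for c in range(0, len(spec[0]) - block_w + 1, block_w):
            block = [row[c:c + block_w] for row in spec[r:r + block_h]]
            feats.append(hu_first_two(block))
    return feats
```

In this sketch, a block whose energy sits entirely on one cell yields a first invariant of zero, while a block with evenly spread energy yields a larger value, which matches the abstract's description of Hu moments as a measure of how concentrated the energy is around its center of gravity.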
Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing