Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
534963 | 870309 | 2016 | 7-page PDF | Free download |
• An evolutionary algorithm for the optimisation of filter banks.
• Filter banks more appropriate to stress and emotion classification were obtained.
• New speech features were obtained through optimised filter banks.
• The optimised features improved the results in stressed speech classification.
Mel-frequency cepstral coefficients introduced biologically inspired features into speech technology and became the most commonly used representation for speech, speaker and emotion recognition, and even for applications in music. Although this representation is popular, it is optimistic to assume that it provides the best results for every application, since it is not tailored to any specific task. This work proposes a methodology to learn a speech representation from data by optimising a filter bank, in order to improve results in the classification of stressed speech. Since population-based metaheuristics have proved successful in related applications, an evolutionary algorithm is designed to search for a filter bank that maximises the classification accuracy. For the encoding, spline functions are used to shape the filter banks, which reduces the number of parameters to optimise. The filter banks obtained with the proposed methodology improve the results in stressed and emotional speech classification.
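The approach summarised above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail below is an assumption: the genome is a handful of control points, a Catmull-Rom spline through them sets the centre frequencies of triangular filters, and a small evolutionary loop (tournament selection, uniform crossover, Gaussian mutation, elitism) searches over the control points. All function names are hypothetical. The fitness used here is a toy stand-in (closeness to a fixed target layout); in the described method it would be the classification accuracy obtained with features from the candidate filter bank.

```python
import random


def catmull_rom(ctrl, n_points):
    """Sample a Catmull-Rom cubic spline through `ctrl` at n_points even steps."""
    pts = [ctrl[0]] + list(ctrl) + [ctrl[-1]]  # pad ends so each segment has 4 points
    n_seg = len(ctrl) - 1
    out = []
    for k in range(n_points):
        u = k / (n_points - 1) * n_seg         # position along the spline
        i = min(int(u), n_seg - 1)             # segment index
        t = u - i                              # local parameter in [0, 1]
        p0, p1, p2, p3 = pts[i:i + 4]
        out.append(0.5 * (2 * p1 + (p2 - p0) * t
                          + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                          + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3))
    return out


def centre_frequencies(genome, n_filters, f_max):
    """Map a few control points (the genome) to n_filters sorted centres in Hz."""
    curve = catmull_rom(genome, n_filters)
    clipped = sorted(min(max(c, 0.0), 1.0) for c in curve)
    return [f_max * (0.02 + 0.96 * c) for c in clipped]  # keep strictly inside (0, f_max)


def filter_bank(centres, n_bins, f_max):
    """Build triangular filters; each peaks at its centre, zero at its neighbours."""
    edges = [0.0] + list(centres) + [f_max]
    bank = []
    for i in range(1, len(edges) - 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        row = []
        for b in range(n_bins):
            f = b * f_max / (n_bins - 1)       # frequency of spectrum bin b
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def evolve(fitness, n_genes=4, pop_size=20, generations=30, sigma=0.1, seed=0):
    """Tiny evolutionary loop; returns the best genome and best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        new_pop = [elite]                              # elitism: best survives unchanged
        while len(new_pop) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [min(max(g + rng.gauss(0, sigma), 0.0), 1.0) for g in child]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness), history


# Toy stand-in for classification accuracy (assumption): reward genomes whose
# filter centres are close to an arbitrary target layout.
TARGET = centre_frequencies([0.0, 0.1, 0.4, 1.0], 12, 8000.0)


def toy_fitness(genome):
    centres = centre_frequencies(genome, 12, 8000.0)
    return -sum((c - t) ** 2 for c, t in zip(centres, TARGET))


if __name__ == "__main__":
    best, history = evolve(toy_fitness)
    bank = filter_bank(centre_frequencies(best, 12, 8000.0), n_bins=129, f_max=8000.0)
    print(len(bank), "filters; best toy fitness:", history[-1])
```

With four control points the search space has 4 dimensions regardless of the number of filters, which is the parameter reduction the spline encoding is meant to provide. Replacing `toy_fitness` with a function that extracts cepstral features through the candidate filter bank and returns a classifier's validation accuracy would bring the sketch closer to the described methodology.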
Journal: Pattern Recognition Letters - Volume 84, 1 December 2016, Pages 1–7