Article ID: 6864733
Journal: Neurocomputing
Published Year: 2018
Pages: 9 Pages
File Type: PDF
Abstract
Speech recognition systems exhibit performance degradation due to variability in speech caused by the accents or dialects of speakers. This can be overcome by correctly identifying the speaker's accent or dialect and using that information to adapt the speech recognition system. In this paper, we apply extreme learning machines (ELMs) and support vector machines (SVMs) to the problem of accent/dialect classification on the TIMIT dataset. We use Mel-frequency cepstral coefficients (MFCCs) and the normalized energy parameter, along with their first and second derivatives, as raw features for training ELMs and SVMs. We propose a weighted accent classification algorithm that uses a novel architecture to classify North American accents into seven groups. Using this algorithm, we obtain a classification accuracy of 77.88% with ELMs, which, to our knowledge, is the best result reported for accent classification on the TIMIT dataset. We also compare the performance of ELMs and SVMs as classifiers for our weighted accent classification algorithm, and against multi-class classification using ELMs or SVMs.
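As an illustration of the two building blocks named in the abstract, the following is a minimal sketch of a generic single-hidden-layer ELM and of appending first and second derivatives to frame-level features. It is not the paper's weighted accent classification algorithm, whose architecture the abstract does not describe. The names `add_deltas` and `ELMClassifier`, the tanh activation, the ridge term, the hidden-layer size, and the use of simple numerical gradients for the derivatives are all assumptions made for this example.

```python
import numpy as np


def add_deltas(frames):
    """Append first and second time derivatives to a (T, D) feature matrix.

    Assumption: derivatives are simple numerical gradients along the time
    axis; the abstract does not state the exact delta formula used.
    """
    delta = np.gradient(frames, axis=0)    # first derivative over time
    delta2 = np.gradient(delta, axis=0)    # second derivative over time
    return np.hstack([frames, delta, delta2])


class ELMClassifier:
    """Generic ELM: random fixed hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=64, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge                  # small regularizer (assumption)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden activations; W and b are random and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One-hot encode the class labels as regression targets.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Solve the ridge-regularized least-squares problem for beta.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

For example, a hypothetical `(T, 13)` matrix of per-frame features becomes a `(T, 39)` matrix after `add_deltas`, and `ELMClassifier().fit(X, y).predict(X_new)` returns one predicted label per row. Only the output weights are learned, in closed form, which is why ELM training is typically fast compared with iterative methods.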
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence