Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
380327 | 1437435 | 2015 | 9-page PDF | Free download |
Nonlinear properties of a complex signal can be represented in a reconstructed phase space (RPS). Researchers have previously developed RPS-based feature extraction approaches to capture these properties. Typically, such approaches are more computationally demanding (longer run time) and less accurate than traditional techniques such as Mel-frequency cepstral coefficients (MFCCs), which nevertheless fail to capture nonlinear properties of signals. To overcome these issues, we propose a new RPS-based feature extraction approach that builds on a previously reported one. The proposed approach calculates the similarities between the embedded speech signals and a set of predefined speech attractor models in the RPS, and uses these similarities as input features for a final phonetic classifier. A set of Gaussian mixture models (GMMs) is trained to represent the variety of phoneme attractors in the RPS. Using the trained GMMs, a feature vector consisting of the log-likelihoods is calculated for each embedded out-of-sample speech signal. A multilayer perceptron (MLP) classifier then estimates posterior probabilities for the phoneme classes. To test its performance, we apply the approach to a Persian speech corpus (FARSDAT). Results show a 1.89% absolute improvement in classification accuracy over a baseline system that uses MFCC features. Combining different classifiers that use the proposed RPS-based features and MFCC features yields the highest phoneme classification accuracy of 68.85%, an absolute improvement of 4.78% over a baseline system.
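The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the embedding dimension and lag, the diagonal-covariance GMM form, and the averaging of log-likelihoods over all embedded points are assumptions made for illustration, and the model parameters are set by hand rather than trained.

```python
import math


def embed(signal, dim=3, lag=2):
    """Time-delay embedding: x[n] -> (x[n], x[n+lag], ..., x[n+(dim-1)*lag]).

    dim and lag are illustrative defaults, not values from the paper.
    """
    span = (dim - 1) * lag
    return [tuple(signal[n + k * lag] for k in range(dim))
            for n in range(len(signal) - span)]


def gmm_log_likelihood(point, gmm):
    """Log-density of one RPS point under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples (assumed layout).
    """
    terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x, mu, var in zip(point, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        terms.append(log_p)
    peak = max(terms)  # log-sum-exp keeps the mixture sum numerically stable
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def rps_features(signal, phoneme_gmms, dim=3, lag=2):
    """Return one average log-likelihood per attractor model.

    Averaging over embedded points is an assumption of this sketch.
    """
    points = embed(signal, dim, lag)
    return [sum(gmm_log_likelihood(p, g) for p in points) / len(points)
            for g in phoneme_gmms]


# Toy usage: two hand-set single-component "attractor models".
toy_models = [
    [(1.0, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0))],
    [(1.0, (5.0, 5.0, 5.0), (1.0, 1.0, 1.0))],
]
toy_signal = [math.sin(0.3 * n) for n in range(50)]
features = rps_features(toy_signal, toy_models)
```

In a full system along the lines of the abstract, the resulting vector (one entry per phoneme attractor model) would be the input to the MLP classifier, and the GMM parameters would come from training on embedded speech rather than being fixed by hand.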
Journal: Engineering Applications of Artificial Intelligence - Volume 44, September 2015, Pages 1–9