Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
565942 | 875872 | 2012 | 17-page PDF | Free download |

Unvoiced stops are rapidly varying sounds whose acoustic cues to place identity are tied to their temporal dynamics. Neurophysiological studies have indicated the importance of joint spectro-temporal processing in the human perception of stops. In this study, two distinct approaches to modeling the spectro-temporal envelope of unvoiced stop phone segments are investigated with a view to obtaining a low-dimensional feature vector for automatic place classification. Classification accuracies on the TIMIT database and a Marathi words dataset show that a classifier combination of polynomial surface coefficients and 2D-DCT features performs best overall. A comparison with published results on the place classification of stops shows that the proposed spectro-temporal feature systems improve upon the best previously reported performance. The results indicate that joint spectro-temporal features may be usefully incorporated in hierarchical phone classifiers based on diverse class-specific features.
► Segment-based classification of unvoiced stops in English and Marathi is addressed.
► Localized discrete cosine transform coefficients are evaluated for the task.
► Joint spectro-temporal modeling via bivariate polynomials is proposed.
► The best previously published results on the same task are improved upon.
Journal: Speech Communication - Volume 54, Issue 10, December 2012, Pages 1104–1120
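
The two feature types named in the abstract, low-order 2D-DCT coefficients and bivariate polynomial surface coefficients, can be illustrated with a minimal sketch. This is not the authors' implementation: the patch size, the number of retained DCT coefficients, the polynomial degree, and the function names `dct2_features` and `poly_surface_features` are all assumptions chosen for illustration, and the input is a synthetic surface rather than a real stop segment.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]          # basis (frequency) index
    i = np.arange(n)[None, :]          # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)         # DC row has its own scaling
    return m


def dct2_features(patch, n_freq=3, n_time=3):
    """Keep the lowest n_freq x n_time 2D-DCT coefficients of a patch.

    `patch` is a (frequency x time) array; the retained block size is
    an assumed setting, not a value taken from the paper.
    """
    n_f, n_t = patch.shape
    coeffs = dct_matrix(n_f) @ patch @ dct_matrix(n_t).T
    return coeffs[:n_freq, :n_time].ravel()


def poly_surface_features(patch, degree=2):
    """Least-squares fit of a bivariate polynomial to a patch.

    Both axes are mapped to [-1, 1]. Returns the coefficients and the
    matching (frequency power, time power) exponent pairs.
    """
    n_f, n_t = patch.shape
    f, t = np.meshgrid(np.linspace(-1, 1, n_f),
                       np.linspace(-1, 1, n_t), indexing="ij")
    terms = [(i, j) for i in range(degree + 1)
             for j in range(degree + 1 - i)]
    design = np.stack([(f ** i * t ** j).ravel() for i, j in terms], axis=1)
    coef, *_ = np.linalg.lstsq(design, patch.ravel(), rcond=None)
    return coef, terms


# Synthetic (frequency x time) surface standing in for a real envelope.
f, t = np.meshgrid(np.linspace(-1, 1, 8), np.linspace(-1, 1, 10),
                   indexing="ij")
patch = 1.0 + 2.0 * f - 3.0 * t + 0.5 * f * t

dct_vec = dct2_features(patch)
poly_vec, terms = poly_surface_features(patch)
print("2D-DCT feature length:", dct_vec.size)
print("polynomial terms:", terms)
print("polynomial coefficients:", np.round(poly_vec, 3))
```

Either vector is a compact description of the whole patch. In a setup like the one the abstract describes, each could feed its own classifier, with the classifier outputs combined afterwards; how that combination is done is not specified here.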