Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4977781 | Speech Communication | 2017 | 13 Pages | |
Abstract
This work reports a method of Consonant-Vowel (CV) unit recognition that detects the Dominant Aperiodic component Regions (DARs) and predicts the Duration of Transition Regions (DTRs) in speech. DAR detection is performed using complementary information from the source and the vocal tract. Source information is extracted using sub-fundamental frequency filtering of speech, while vocal tract information is extracted using a) the Dominant Resonant Frequency (DRF) and b) the High to Low Frequency component Ratio (HLFR), both computed from the Hilbert envelope of the Numerator Group Delay (HNGD) spectrum of the zero-time windowed signal. The DTR is predicted using vocal tract constriction information. The detected DARs and predicted DTRs are then compared with manually marked regions and finally used for CV unit recognition in Indian languages. Conventionally, CV unit recognition is performed by anchoring the Vowel Onset Point (VOP) and assuming fixed durations for the transition and consonant regions on either side of the VOP. However, in speech, the durations of the transition and consonantal regions vary depending on the type of consonant and vowel. In the proposed method, the use of dynamic values for the consonant and transition region durations has resulted in better consonant recognition, which in turn improves CV unit recognition.
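The contrast between fixed and dynamic region durations around the VOP can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cv_regions`, the 40 ms fixed durations, and the per-token values below are all hypothetical, chosen only to show how the region boundaries move when durations are supplied per token instead of being assumed constant. The sketch assumes the consonant region ends at the VOP and the transition region begins at it.

```python
from typing import NamedTuple, Tuple


class CVRegions(NamedTuple):
    """Region boundaries in milliseconds, each as (start_ms, end_ms)."""
    consonant: Tuple[int, int]
    transition: Tuple[int, int]


def cv_regions(vop_ms: int, consonant_ms: int, transition_ms: int) -> CVRegions:
    """Place a consonant region before the VOP and a transition region after it.

    Assumption for illustration: the consonant region ends at the VOP and the
    transition region starts at the VOP.
    """
    if consonant_ms < 0 or transition_ms < 0:
        raise ValueError("durations must be non-negative")
    consonant_start = max(0, vop_ms - consonant_ms)  # clamp at utterance start
    return CVRegions(
        consonant=(consonant_start, vop_ms),
        transition=(vop_ms, vop_ms + transition_ms),
    )


# Conventional approach: the same (hypothetical) durations for every token.
FIXED_CONSONANT_MS = 40
FIXED_TRANSITION_MS = 40
fixed = cv_regions(250, FIXED_CONSONANT_MS, FIXED_TRANSITION_MS)

# Dynamic approach: durations supplied per token, e.g. a consonant duration
# derived from a detected DAR and a predicted DTR (values here are made up).
dynamic = cv_regions(250, consonant_ms=85, transition_ms=25)

print(fixed)
print(dynamic)
```

With the same VOP, the dynamic call covers a longer consonant region and a shorter transition region than the fixed one, which is the kind of per-token variation the abstract argues a fixed-duration scheme cannot capture.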
Related Topics
Physical Sciences and Engineering
Computer Science
Signal Processing
Authors
Biswajit D. Sarma, S.R. Mahadeva Prasanna, Priyankoo Sarmah