Consonant-vowel unit recognition using dominant aperiodic and transition region detection

Article ID	Journal	Published Year	Pages	File Type
4977781	Speech Communication	2017	13 Pages	PDF

Abstract

This work reports a method of Consonant-Vowel (CV) unit recognition that detects the Dominant Aperiodic component Regions (DARs) and predicts the Duration of Transition Regions (DTRs) in speech. DAR detection is performed using complementary information from the source and the vocal tract. Source information is extracted using sub-fundamental frequency filtering of speech, while vocal tract information is extracted using a) the Dominant Resonant Frequency (DRF) and b) the High to Low Frequency component Ratio (HLFR), both computed from the Hilbert envelope of the Numerator Group Delay (HNGD) spectrum of the zero-time windowed signal. The DTR is predicted using vocal tract constriction information. The detected DARs and predicted DTRs are then compared with manually marked regions and finally used for CV unit recognition in Indian languages. Conventionally, CV unit recognition is performed by anchoring the Vowel Onset Point (VOP) and assuming fixed durations for the transition and consonant regions on either side of the VOP. In speech, however, the durations of the transition and consonantal regions vary depending on the type of consonant and vowel. In the proposed method, the use of dynamic values for the consonant duration and the transition region has resulted in better consonant recognition, which in turn improves CV unit recognition.
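The contrast between fixed and dynamic region boundaries can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default durations, the example times, and the assumed layout (consonant region ending at the VOP, transition region starting at the VOP) are all hypothetical choices made for illustration only.

```python
# Hypothetical sketch: fixed vs. dynamic segmentation of a CV unit around
# the vowel onset point (VOP). All times are in milliseconds.
# Assumed layout (not confirmed by the abstract): the consonant region lies
# before the VOP and the transition region lies after it.

def fixed_segments(vop_ms, consonant_ms=40.0, transition_ms=30.0):
    """Conventional approach: fixed-length windows on either side of the VOP.

    The default durations are illustrative placeholders, not values from the paper.
    """
    consonant = (max(0.0, vop_ms - consonant_ms), vop_ms)
    transition = (vop_ms, vop_ms + transition_ms)
    return consonant, transition


def dynamic_segments(vop_ms, dar, dtr_ms):
    """Dynamic approach: boundaries come from a detected DAR and a predicted DTR.

    dar    -- (start_ms, end_ms) of a detected dominant aperiodic region,
              assumed here to mark the consonant region
    dtr_ms -- predicted duration of the transition region
    """
    dar_start, dar_end = dar
    # Clip the consonant region so that it never extends past the VOP.
    consonant = (dar_start, min(dar_end, vop_ms))
    transition = (vop_ms, vop_ms + dtr_ms)
    return consonant, transition


if __name__ == "__main__":
    vop = 100.0  # made-up VOP location
    print("fixed:  ", fixed_segments(vop))
    print("dynamic:", dynamic_segments(vop, dar=(35.0, 95.0), dtr_ms=45.0))
```

With fixed windows, every CV unit receives the same consonant and transition lengths regardless of consonant type. In the dynamic variant, the region boundaries follow whatever the detector and predictor supply for each unit.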

Keywords

COR; DTR; SFF; Vowel onset point; VTC; VTR; DAR; Transition region