Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

Article ID	Journal	Published Year	Pages	File Type
535758	Pattern Recognition Letters	2013	9 Pages	PDF

Abstract

This article describes the development of a speech recognition system in the Bengali language for the aging population using various adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. In general, perceptual as well as acoustic variations exist among speakers, but the variations are more pronounced between young and aged populations. Deviation in voice source features between the two age groups affects speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all variability in order to develop a robust speech recognition system. However, speaker normalization and adaptation techniques attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. Here, conventional acoustic model adaptation methods, e.g. vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori, are combined to improve recognition accuracy. Moreover, the maximum mutual information estimation technique has been implemented in this study.

► We have developed an automatic speech recognition system in Bengali for the aged population.
► We have analyzed the phoneme and word recognition performance of aged people employing several acoustic models.
► We have combined speaker normalization and model adaptation techniques to improve recognition performance.
► We have identified the phones that are more affected, which motivates us to incorporate these findings in acoustic model creation in the future.

Keywords

Maximum a posteriori (MAP)