Article ID Journal Published Year Pages File Type
535758 Pattern Recognition Letters 2013 9 Pages PDF
Abstract

This article describes the development of a Bengali speech recognition system for the aging population using several adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. Perceptual and acoustic variations exist among speakers in general, but they are more pronounced between young and aged speakers. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all of this variability in order to build a robust system. Speaker normalization and adaptation techniques, however, attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. In the current study, conventional acoustic model adaptation methods such as vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori adaptation are combined to improve recognition accuracy. In addition, the maximum mutual information estimation technique has been implemented.

► We have developed an automatic speech recognition system in Bengali for the aged population.
► We have analyzed the phoneme and word recognition performance for aged speakers using several acoustic models.
► We have combined speaker normalization and model adaptation techniques to improve recognition performance.
► We have identified the more strongly affected phones, which motivates us to incorporate these findings into acoustic model creation in the future.
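
To make the idea of adapting an acoustic model with little data concrete, here is a minimal sketch of a maximum a posteriori (MAP) update for a single Gaussian mean. It is not the authors' implementation. The function name `map_adapt_mean`, the prior weight `tau`, and the hard assignment of frames to one Gaussian are assumptions made for illustration; practical systems weight each frame by its state occupation probability.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    tau: float = 10.0,
) -> List[float]:
    """MAP re-estimate of one Gaussian mean (illustrative sketch).

    Uses the standard interpolation
        mu_map = (tau * mu_prior + sum_t x_t) / (tau + n)
    where n is the number of adaptation frames. Assumes every frame
    is hard-assigned to this Gaussian, which is a simplification.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    dim = len(prior_mean)
    n = len(frames)

    # Accumulate the per-dimension sum of the adaptation frames.
    sums = [0.0] * dim
    for x in frames:
        if len(x) != dim:
            raise ValueError("frame dimension does not match the mean")
        for d in range(dim):
            sums[d] += x[d]

    # Interpolate between the prior mean and the adaptation data.
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


# With few frames the estimate stays close to the prior mean;
# with many frames it moves toward the adaptation data.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [4.0, 8.0]], tau=2.0)
```

A larger `tau` keeps the adapted mean closer to the speaker-independent prior, which is why this kind of update can remain stable when only a small amount of speech from the target group is available.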
