Improvement Comparison of Different Lattice-based Discriminative Training Methods in Chinese-monolingual and Chinese-English-bilingual Speech Recognition

Article ID	Journal	Published Year	Pages	File Type
694462	Acta Automatica Sinica	2012	7 Pages	PDF

Abstract

Discriminative training approaches such as minimum phone error (MPE), feature minimum phone error (fMPE) and boosted maximum mutual information (BMMI) have brought remarkable improvements to the speech community in recent years; however, much work remains to be done. This paper investigates the performance of three lattice-based discriminative training methods in detail and compares different I-smoothing methods for obtaining more robust models in the Chinese-monolingual situation. The complementary properties of the different discriminative training methods are exploited to perform system combination by recognizer output voting error reduction (ROVER). Although discriminative training is normally used in monolingual systems, this paper systematically investigates its use, including MPE, fMPE, and BMMI, for bilingual speech recognition. A new method is proposed to generate significantly better lattices for training the bilingual model, and complementary discriminative training models are also explored to obtain the best ROVER performance in the bilingual situation. Experimental results show that all forms of discriminative training reduce the word error rate in both monolingual and bilingual systems, and that combining complementary discriminative training methods improves performance significantly.
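
To make the two quantities the abstract relies on concrete, the following is a minimal Python sketch of word error rate and of a ROVER-style majority vote. It is an illustration under stated assumptions, not the paper's implementation: the function names `wer` and `vote`, the `NULL` placeholder, and the toy sentences are all hypothetical, and the hypotheses are assumed to be already aligned slot by slot. Real ROVER first builds a word transition network by dynamic-programming alignment and may also weight votes by confidence scores; both steps are omitted here.

```python
from collections import Counter

NULL = "<eps>"  # hypothetical placeholder for an empty slot in an alignment


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.

    Assumes a non-empty reference string.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1] / len(ref)


def vote(aligned_hypotheses):
    """Pick the most frequent word in each slot of pre-aligned hypotheses.

    Each hypothesis is a list of words of equal length; NULL marks a gap.
    Ties go to the word that appears first among the systems.
    """
    result = []
    for slot in zip(*aligned_hypotheses):
        word, _count = Counter(slot).most_common(1)[0]
        if word != NULL:
            result.append(word)
    return " ".join(result)


# Toy data (invented for illustration, not results from the paper):
# three systems that each make one different error.
reference = "the cat sat on the mat"
systems = [
    "the cat sat on a mat".split(),
    "the cat sit on the mat".split(),
    "the hat sat on the mat".split(),
]
combined = vote(systems)
individual = [wer(reference, " ".join(s)) for s in systems]
```

In this toy case each system errs in a different slot, so the vote recovers the reference exactly. That is the intuition behind combining complementary models: voting helps most when the systems' errors do not coincide.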