Article ID | Journal | Published Year | Pages | File Type
---|---|---|---|---
565952 | Speech Communication | 2012 | 10 Pages |
This paper focuses on discriminative language models (DLMs) for large-vocabulary speech recognition tasks. To train such models, we usually use a large number of hypotheses generated for each utterance by a speech recognizer, in the form of an n-best list or a lattice. Since the data size is large, model training usually requires a high-end machine or a large-scale distributed computation system consisting of many computers. However, it is still unclear whether such a large number of sentence hypotheses is actually necessary, or which kinds of sentences are needed. In this paper, we show that we can generate a high-performance model from small subsets of the n-best lists by choosing samples properly, i.e., we describe a sample selection method for DLMs. Sample selection reduces the memory footprint needed for holding training samples and allows us to train models on a standard machine. Furthermore, it enables us to generate a highly accurate model using various types of features. Specifically, experimental results show that training with as few as two samples from each list can provide an accurate model with a small memory footprint.
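To make the idea concrete, the following is a minimal, hypothetical sketch of sample selection from an n-best list. It assumes one plausible selection criterion (keeping the recognizer's top-ranked hypothesis and the lowest-error "oracle" hypothesis, i.e., two samples per list); the paper's actual selection criteria may differ, and all names here are illustrative.

```python
# Hypothetical sample-selection sketch for DLM training: from each
# utterance's n-best list, keep only two hypotheses -- the recognizer's
# 1-best and the lowest-error ("oracle") one. This is an assumption
# about the selection criterion, not the paper's exact method.

def word_error(hyp, ref):
    """Crude error count: position-wise word mismatches plus length difference."""
    return sum(h != r for h, r in zip(hyp, ref)) + abs(len(hyp) - len(ref))

def select_samples(nbest, ref, k=2):
    """Pick up to k hypotheses per list: the top-ranked one and the oracle."""
    top = nbest[0]                                      # recognizer's 1-best
    oracle = min(nbest, key=lambda h: word_error(h, ref))
    selected = [top]
    if oracle is not top and len(selected) < k:
        selected.append(oracle)                         # lowest-error hypothesis
    return selected

# Toy n-best list for one utterance; the reference transcription is known
# at training time, so the oracle hypothesis can be identified.
nbest = [
    ["recognize", "speech"],            # 1-best (contains an error)
    ["wreck", "a", "nice", "beach"],
    ["recognise", "speech"],            # error-free oracle hypothesis
]
ref = ["recognise", "speech"]
print(select_samples(nbest, ref))
```

Only the selected pairs are then fed to the discriminative trainer, which is what shrinks the memory footprint relative to holding full n-best lists.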
► We show the effect of sample selection in discriminative language model training.
► The effect is investigated in speech recognition with various training approaches.
► The accuracy improves with a small set of selected samples in some training methods.
► Sample selection dramatically reduces the memory footprint needed for training.
► A variety of features is effective and easy to use in this sample selection scenario.