Article code: 565952
Journal code: 875876
Publication year: 2012
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Efficient training of discriminative language models by sample selection
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract

This paper focuses on discriminative language models (DLMs) for large vocabulary speech recognition tasks. To train such models, we usually use a large number of hypotheses generated for each utterance by a speech recognizer, namely an n-best list or a lattice. Since the data size is large, we usually need a high-end machine or a large-scale distributed computation system consisting of many computers for model training. However, it is still unclear whether such a large number of sentence hypotheses is necessary. Furthermore, we do not know which kinds of sentences are necessary. In this paper, we show that we can generate a high-performance model using small subsets of the n-best lists by choosing samples properly, i.e., we describe a sample selection method for DLMs. Sample selection reduces the memory footprint needed for holding training samples and allows us to train models on a standard machine. Furthermore, it enables us to generate a highly accurate model using various types of features. Specifically, experimental results show that even training with only two samples from each list can provide an accurate model with a small memory footprint.


► We show the effect of sample selection in discriminative language model training.
► The effect is investigated in speech recognition with various training approaches.
► The accuracy improves with a small set of selected samples in some training methods.
► Sample selection dramatically reduces the memory footprint needed for training.
► A variety of features is effective and easy to use in this sample selection scenario.
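
The abstract does not say which two samples are kept from each n-best list, or which training algorithm is used with them. As a rough illustration only, the Python sketch below assumes one plausible criterion: keep the hypothesis with the fewest word errors (an "oracle") and the best-scoring hypothesis that has more errors (a "competitor"), then apply a simple perceptron-style update on word-count features. The names `Hypothesis`, `select_two`, `unigram_features`, and `perceptron_epoch` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    words: List[str]
    errors: int        # word errors against the reference transcript
    asr_score: float   # recognizer score; higher is better


def select_two(nbest: List[Hypothesis]) -> List[Hypothesis]:
    """Keep at most two hypotheses from one n-best list.

    Assumed criterion (not taken from the paper): the oracle is the
    hypothesis with the fewest errors, and the competitor is the
    highest-scoring hypothesis with more errors than the oracle.
    """
    oracle = min(nbest, key=lambda h: (h.errors, -h.asr_score))
    rivals = [h for h in nbest if h.errors > oracle.errors]
    if not rivals:
        return [oracle]
    competitor = max(rivals, key=lambda h: h.asr_score)
    return [oracle, competitor]


def unigram_features(h: Hypothesis) -> Dict[str, float]:
    """Word-count features; a stand-in for the richer features a DLM may use."""
    feats: Dict[str, float] = {}
    for w in h.words:
        feats[w] = feats.get(w, 0.0) + 1.0
    return feats


def model_score(h: Hypothesis, weights: Dict[str, float]) -> float:
    """Recognizer score plus the linear DLM correction."""
    return h.asr_score + sum(
        weights.get(f, 0.0) * v for f, v in unigram_features(h).items()
    )


def perceptron_epoch(
    selected: List[List[Hypothesis]], weights: Dict[str, float], lr: float = 1.0
) -> Dict[str, float]:
    """One pass of perceptron-style updates over the selected pairs."""
    for pair in selected:
        if len(pair) < 2:
            continue  # no competitor, so nothing to discriminate against
        oracle, competitor = pair
        if model_score(competitor, weights) >= model_score(oracle, weights):
            # Move weights toward the oracle and away from the competitor.
            for f, v in unigram_features(oracle).items():
                weights[f] = weights.get(f, 0.0) + lr * v
            for f, v in unigram_features(competitor).items():
                weights[f] = weights.get(f, 0.0) - lr * v
    return weights
```

Under this assumption, only two hypotheses per utterance need to be held in memory during training, rather than the full n-best list, which is the kind of memory saving the highlights describe.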

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 6, July 2012, Pages 791–800
Authors
, , ,