Article ID: 563116
Journal ID: 875471
Publication year: 2013
English paper: 21 pages, PDF
Full-text version: free download
English title of the ISI paper
Use of contexts in language model interpolation and adaptation
Related subjects
Engineering and Basic Sciences, Computer Engineering, Signal Processing
Abstract

Language models (LMs) are often constructed by combining multiple individual component models using context independent interpolation weights. By tuning these weights, using either perplexity or discriminative approaches, it is possible to adapt LMs to a particular task. This paper investigates the use of context dependent weighting in both interpolation and test-time adaptation of language models. Depending on the preceding word context, a discrete history weighting function adjusts the contribution from each component model. As this dramatically increases the number of parameters to estimate, robust weight estimation schemes are required. Several approaches are described in this paper. The first approach is based on MAP estimation, where interpolation weights of lower order contexts are used as smoothing priors. The second approach uses training data to ensure robust estimation of LM interpolation weights; this can also serve as a smoothing prior for MAP adaptation. A normalized perplexity metric is proposed to handle the standard perplexity criterion's bias toward corpus size. A range of schemes to combine weight information obtained from training data and test data hypotheses is also proposed to improve robustness during context dependent LM adaptation. In addition, a minimum Bayes' risk (MBR) based discriminative training scheme and an efficient weighted finite state transducer (WFST) decoding algorithm for context dependent interpolation are presented. The proposed technique was evaluated on a state-of-the-art Mandarin Chinese broadcast speech transcription task. Character error rate (CER) reductions of up to 7.3% relative were obtained, along with consistent perplexity improvements.
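
To make the core idea concrete, the following is a minimal, self-contained Python sketch of context dependent interpolation with MAP-smoothed weights. It is only a sketch under assumptions, not the paper's implementation: the class name ContextDependentInterpolator, the prior-strength constant tau, and the dict-backed toy component models are illustrative choices, and the paper's actual estimation and WFST decoding procedures are more elaborate.

from collections import defaultdict

class ContextDependentInterpolator:
    """Linear interpolation of component LMs with per-context weights.

    Weights are MAP estimates: expected per-context counts smoothed
    toward context independent prior weights (the lower order context,
    in the abstract's terminology).
    """

    def __init__(self, components, prior_weights, tau=5.0):
        # components: list of callables p_m(word, history) -> probability
        # prior_weights: context independent interpolation weights (prior)
        # tau: prior strength; larger values keep per-context weights
        #      closer to the prior (illustrative value, not from the paper)
        assert abs(sum(prior_weights) - 1.0) < 1e-6
        self.components = components
        self.prior = list(prior_weights)
        self.tau = tau
        # expected count of each component "explaining" events in a context
        self.counts = defaultdict(lambda: [0.0] * len(components))

    def weights(self, history):
        # MAP estimate: (count_m + tau * prior_m) / (total_count + tau)
        c = self.counts[history]
        denom = sum(c) + self.tau
        return [(cm + self.tau * pm) / denom
                for cm, pm in zip(c, self.prior)]

    def prob(self, word, history):
        # P(w|h) = sum_m lambda_m(h) * P_m(w|h)
        return sum(l * p(word, history)
                   for l, p in zip(self.weights(history), self.components))

    def accumulate(self, word, history):
        # Simplified online E-step: add each component's posterior
        # responsibility for this event to the per-context counts.
        lam = self.weights(history)
        joint = [l * p(word, history) for l, p in zip(lam, self.components)]
        z = sum(joint)
        for m, j in enumerate(joint):
            self.counts[history][m] += j / z


# Toy usage: two dict-backed "component LMs" (probabilities illustrative,
# not normalized); the context here is just the previous word.
def make_lm(table, floor=1e-3):
    return lambda w, h: table.get((h, w), floor)

lm_a = make_lm({('the', 'cat'): 0.5, ('the', 'dog'): 0.3})
lm_b = make_lm({('the', 'dog'): 0.6})
interp = ContextDependentInterpolator([lm_a, lm_b], [0.5, 0.5])
for _ in range(10):                     # pretend held-out observations
    interp.accumulate('dog', 'the')
print(interp.weights('the'))            # shifted toward lm_b after 'the'
print(interp.prob('dog', 'the'))

A second sketch after the highlights below shows the classic perplexity-tuned, context independent baseline whose weights a MAP scheme like this one can take as its smoothing prior.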


► Context dependent language model interpolation and adaptation.
► Robust weight estimation schemes proposed.
► Normalized perplexity metric for LM interpolation.
► Discriminative training schemes studied.
► Significant relative error rate reductions of up to 7.3%.
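
The context independent baseline referred to above, interpolation weights tuned on held-out data via the perplexity criterion, can be sketched as classic EM re-estimation. The names perplexity and tune_weights are illustrative, and only the standard perplexity criterion is computed here; the normalized perplexity metric that the paper proposes to correct corpus-size bias is not reproduced.

import math

def perplexity(lm_prob, heldout):
    # Standard perplexity: exp of the average negative log probability.
    # lm_prob: callable (word, history) -> probability
    # heldout: list of (history, word) pairs
    total = sum(math.log(lm_prob(w, h)) for h, w in heldout)
    return math.exp(-total / len(heldout))

def tune_weights(components, heldout, iters=20):
    # Classic EM for context independent interpolation weights; each
    # iteration cannot increase held-out perplexity.
    m = len(components)
    lam = [1.0 / m] * m                         # uniform start
    for _ in range(iters):
        acc = [0.0] * m
        for h, w in heldout:
            joint = [l * p(w, h) for l, p in zip(lam, components)]
            z = sum(joint)
            for i, j in enumerate(joint):
                acc[i] += j / z                 # E-step: posteriors
        lam = [a / len(heldout) for a in acc]   # M-step: renormalize
    return lam

Weights returned by tune_weights are the kind of context independent estimates that the MAP scheme in the earlier sketch takes as its smoothing prior.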

Publisher
Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 27, Issue 1, January 2013, Pages 301–321