Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
558222 | 1451691 | 2016 | 17-page PDF | Free download |
• We analyze why semi-supervised backoff language modeling performs poorly.
• We motivate MAP adaptation of a log-linear language model.
• We use automatic transcripts as a prior for language model estimation.
• We show consistent reduction in WER across a range of low-resource conditions.
Many under-resourced languages, such as diglossic Arabic varieties or Hindi sub-dialects, do not have sufficient in-domain text to build strong language models for use with automatic speech recognition (ASR). Semi-supervised language modeling uses a speech-to-text system to produce automatic transcripts from a large amount of in-domain audio, typically to augment a small amount of manual transcripts. In contrast to the success of semi-supervised acoustic modeling, conventional language modeling techniques have provided only modest gains. This paper first explains the limitations of back-off language models due to their dependence on long-span n-grams, which are difficult to estimate accurately from automatic transcripts. From this analysis, we motivate a more robust use of the automatic counts as a prior over the estimated parameters of a log-linear language model. We demonstrate consistent gains for semi-supervised language models across a range of low-resource conditions.
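The MAP-adaptation idea in the abstract can be illustrated with a minimal sketch: fit a log-linear language model to a small set of manual transcripts while penalizing deviation from prior parameters estimated on automatic transcripts. Everything below is a hypothetical toy (a unigram log-linear model over a three-word vocabulary, gradient ascent by hand), not the paper's actual model or feature set.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary; the paper's models use real n-gram features.
VOCAB = ["a", "b", "c"]

def softmax_probs(theta):
    """P(w) proportional to exp(theta[w]) -- a unigram log-linear model."""
    z = sum(math.exp(theta[w]) for w in VOCAB)
    return {w: math.exp(theta[w]) / z for w in VOCAB}

def map_fit(data, theta_prior, lam=1.0, lr=0.1, steps=500):
    """MAP estimation: maximize the log-likelihood of `data` minus a
    Gaussian penalty (lam/2)*||theta - theta_prior||^2.

    Here `theta_prior` plays the role of parameters estimated from large
    automatic transcripts, and `data` stands in for the small amount of
    manual transcripts; lam controls how strongly the prior is trusted.
    """
    counts = Counter(data)
    n = len(data)
    theta = dict(theta_prior)
    for _ in range(steps):
        p = softmax_probs(theta)
        for w in VOCAB:
            # Gradient of log-likelihood: observed count minus expected count,
            # plus the pull of the prior back toward theta_prior.
            grad = counts[w] - n * p[w] - lam * (theta[w] - theta_prior[w])
            theta[w] += lr * grad / n
    return theta
```

With a large `lam` the estimate stays close to the automatic-transcript prior; with `lam=0` it reduces to maximum likelihood on the manual data alone, which is exactly the trade-off semi-supervised MAP adaptation navigates.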
Journal: Computer Speech & Language - Volume 36, March 2016, Pages 93–109