کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
10368608 874964 2005 34 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Language modeling with probabilistic left corner parsing
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
Language modeling with probabilistic left corner parsing
چکیده انگلیسی
We present a novel language model, suitable for large-vocabulary continuous speech recognition, based on parsing with a probabilistic left corner grammar (PLCG). The PLCG probabilities are conditioned on local and non-local features of the partial parse tree, and some of these features are lexical. They are not derived from another stochastic grammar, but directly induced from a treebank, a corpus of text sentences, annotated with parse trees. A context-enriched constituent represents all partial parse trees that are equivalent with respect to the probability of the next parse move. For computational efficiency the parsing problem is represented as a traversal through a compact stochastic network of constituents connected by PLCG moves. The efficiency of the algorithm is due to the fact that the network consists of recursively nested, shared subnetworks. The PLCG-based language model results from accumulating the probabilities of all (partial) paths through this network. Next word probabilities can be computed synchronously with the probabilistic left corner parsing algorithm in one single pass from left to right. They are guaranteed to be normalized, even when pruning less likely paths. Finally, it is shown experimentally that the PLCG-based language model is a competitive alternative to other syntax-based language models, both in efficiency and accuracy.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Speech & Language - Volume 19, Issue 2, April 2005, Pages 171-204
نویسندگان
, ,