کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
1138328 | 1489209 | 2007 | 10 صفحه PDF | دانلود رایگان |

The problem of subgrammar extraction is precisely defined in the following way: Given a grammar GG and a set of words WW, find a smallest subgrammar of GG that accepts the same set of sentences from W∗W∗ as GG, and for each of them produces the same parse trees. In practical Natural Language Processing applications, the set of words WW is obtained from the text unit. There are practical motivations for doing this operation “just-in-time”, i.e. just before processing the text; hence it is required that this operation be done in an automatic and efficient way. After defining the problem in the general framework, we discuss the problem for context-free grammars (CFG), and give an efficient algorithm for it. We prove that finding the smallest subgrammar for HPSGs is an NP-hard problem, and give an efficient algorithm that solves an easier, approximate version of the problem. We also discuss how the algorithm can be efficiently implemented.
Journal: Mathematical and Computer Modelling - Volume 45, Issues 3–4, February 2007, Pages 394–403