Learning context-free grammars using tabular representations

Article ID	Journal	Published Year	Pages	File Type
10360631	Pattern Recognition	2005	12 Pages	PDF

Abstract

We present a novel algorithm using new hypothesis representations for learning context-free grammars from a finite set of positive and negative examples. We propose an efficient hypothesis representation method which consists of a table-like data structure similar to the parse table used in efficient parsing algorithms for context-free grammars such as Cocke-Younger-Kasami algorithm. By employing this representation method, the problem of learning context-free grammars from examples can be reduced to the problem of partitioning the set of nonterminals. We use genetic algorithms for solving this partitioning problem. Further, we incorporate partially structured examples to improve the efficiency of our learning algorithm, where a structured example is represented by a string with some parentheses inserted to indicate the shape of the derivation tree of the unknown grammar. We demonstrate some experimental results using these algorithms and theoretically analyse the completeness of the search space using the tabular method for context-free grammars.

Keywords

Grammatical inference Genetic algorithm Dynamic programming Context-free grammar