N-gram Parsing for Jointly Training a Discriminative Constituency Parser

Authors: Arda Çelebi and Arzucan Özgür

Polibits, Vol. 47, pp. 5-12, 2013.

Abstract: Syntactic parsers are designed to detect the complete syntactic structure of grammatically correct sentences. In this paper, we introduce the concept of n-gram parsing, which corresponds to generating the constituency parse tree of n consecutive words in a sentence. We create a stand-alone n-gram parser derived from a baseline full discriminative constituency parser and analyze the characteristics of the generated n-gram trees for various values of n. Since the produced n-gram trees are in general smaller and less complex than full parse trees, n-gram parsers are likely to be more robust than full parsers. Therefore, we use n-gram parsing to boost the accuracy of a full discriminative constituency parser in a hierarchical joint learning setup. Our results show that the full parser jointly trained with an n-gram parser performs statistically significantly better than our baseline full parser on the English Penn Treebank test corpus.
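
To make the notion of "n consecutive words" concrete, the short Python sketch below enumerates the word windows that an n-gram parser would take as input. It is only an illustration under stated assumptions: the function name `ngram_windows`, the whitespace tokenization, and the example sentence are hypothetical and do not come from the paper, and the sketch does not build any parse trees or reproduce the authors' parser.

```python
from typing import List, Tuple


def ngram_windows(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n consecutive tokens, scanning left to right.

    Hypothetical helper for illustration only; the paper's own
    n-gram extraction procedure is not described in the abstract.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # A sentence shorter than n yields no windows (empty range).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Assumed example sentence, split on whitespace for simplicity.
sentence = "The cat sat on the mat".split()
trigrams = ngram_windows(sentence, 3)

for window in trigrams:
    print(" ".join(window))
```

Each printed window is a candidate input for an n-gram parser, which would assign it a constituency tree covering only those n words.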

Keywords: Constituency parsing, n-gram parsing, discriminative learning, hierarchical joint learning

PDF: N-gram Parsing for Jointly Training a Discriminative Constituency Parser