Discontinuous Data-Oriented Parsing: A mildly context-sensitive all-fragments grammar

Andreas van Cranenburgh, Remko Scha, Federico Sangati

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

6 Citations (Scopus)
154 Downloads (Pure)

Abstract

Recent advances in parsing technology have made treebank parsing with discontinuous constituents possible, with parser output of competitive quality (Kallmeyer and Maier, 2010). We apply Data-Oriented Parsing (DOP) to a grammar formalism that allows for discontinuous trees (LCFRS). Decisions during parsing are conditioned on all possible fragments, resulting in improved performance.
Despite the fact that both DOP and discontinuity present formidable challenges in terms of computational complexity, the model is reasonably efficient, and surpasses the state of the art in discontinuous parsing.
Original language: English
Title of host publication: Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages
Place of Publication: Dublin, Ireland
Publisher: Association for Computational Linguistics (ACL)
Pages: 34-44
Number of pages: 11
Publication status: Published - Oct-2011
Externally published: Yes
