Abstract
Previous work on treebank parsing with discontinuous constituents using Linear Context-Free Rewriting Systems (LCFRS) has been limited to sentences of up to 30 words, for reasons of computational complexity. There have been some results on binarizing an LCFRS in a manner that minimizes parsing complexity, but the present work shows that parsing long sentences with such an optimally binarized grammar remains infeasible. Instead, we introduce a technique which removes this length restriction, while maintaining a respectable accuracy. The resulting parser has been applied to a discontinuous treebank with favorable results.
Original language | English |
---|---|
Title of host publication | Proceedings of EACL |
Place of Publication | Avignon, France |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 460-470 |
Number of pages | 11 |
Publication status | Published - Apr-2012 |
Externally published | Yes |