A Hybrid Rule-Based and Neural Coreference Resolution System with an Evaluation on Dutch Literature

Andreas van Cranenburgh*, Esther Ploeger, Frank van den Berg, Remi Thüss

*Corresponding author for this work

    Research output: Academic, peer-reviewed


    Abstract

    We introduce a modular, hybrid coreference resolution system that extends a rule-based baseline with three neural classifiers for the subtasks mention detection, mention attributes (gender, animacy, number), and pronoun resolution. The classifiers substantially increase coreference performance in our experiments with Dutch literature across all metrics on the development set: mention detection, LEA, CoNLL, and especially pronoun accuracy. However, on the test set, the best results are obtained with rule-based pronoun resolution. This inconsistent result highlights that the rule-based system is still a strong baseline, and more work is needed to improve pronoun resolution robustly for this dataset. While end-to-end neural systems require no feature engineering and achieve excellent performance in standard benchmarks with large training sets, our simple hybrid system scales well to long document coreference (>10k words) and attains superior results in our experiments on literature.
    Original language: English
    Title of host publication: Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference
    Editors: Maciej Ogrodniczuk, Sameer Pradhan, Massimo Poesio, Yulia Grishina, Vincent Ng
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 47-56
    Number of pages: 10
    Status: Published - Nov 2021
