Fine-Grained Human Evaluation of Neural Versus Phrase-Based Machine Translation

Filip Klubička, Antonio Toral Ruiz, M. Víctor Sánchez-Cartagena

    Research output: Article, Academic, peer review


    Abstract

    We compare three approaches to statistical machine translation (pure phrase-based,
    factored phrase-based and neural) by performing a fine-grained manual evaluation via
    error annotation of the systems’ outputs. The error types in our annotation are compliant
    with the Multidimensional Quality Metrics (MQM), and the annotation is performed by two
    annotators. Inter-annotator agreement is high for such a task, and results show that the
    best performing system (neural) reduces the errors produced by the worst system
    (phrase-based) by 54%.
    Original language: English
    Pages (from-to): 121-132
    Number of pages: 12
    Journal: The Prague Bulletin of Mathematical Linguistics
    Volume: 108
    Issue number: 1
    Status: Published - 2017
