Abstract
We reassess the claims of human parity and super-human performance made at the news shared task of WMT2019 for three translation directions: English > German, English > Russian and German > English. First, we identify three potential issues in the human evaluation of that shared task: (i) the limited amount of intersentential context available, (ii) the limited translation proficiency of the evaluators and (iii) the use of a reference translation. We then conduct a modified evaluation taking these issues into account. Our results indicate that all the claims of human parity and super-human performance made at WMT2019 should be refuted, except the claim of human parity for English > German. Based on our findings, we put forward a set of recommendations and open questions for future assessments of human parity in machine translation.
Original language | English |
---|---|
Title | Proceedings of the 22nd Annual Conference of the European Association for Machine Translation |
Place of publication | Lisboa, Portugal |
Publisher | European Association for Machine Translation |
Pages | 185-194 |
Number of pages | 10 |
Status | Published - 2020 |
Event | Annual Conference of the European Association for Machine Translation - Online, Portugal. Duration: 3 Nov 2020 → 5 Nov 2020. Conference number: 22 |
Conference
Conference | Annual Conference of the European Association for Machine Translation |
---|---|
Country/Region | Portugal |
Period | 03/11/2020 → 05/11/2020 |