Literary-adapted machine translation in a well-resourced language pair: Explorations with More Data and Wider Contexts

Research output: Academic, peer-reviewed

1 Citation (Scopus)
16 Downloads (Pure)

Abstract

Following recent work on literary-adapted machine translation (MT) systems, this paper investigates whether it is worthwhile to build such a system for a reasonably well-resourced language pair, English-to-Dutch, for which generic MT systems (e.g. DeepL) are known to be competitive. Specifically, it presents a system that uses considerably more in-domain training data (novels) than previous work, and explores the use of instances longer than isolated sentence pairs (i.e. document-level MT). The evaluation uses a sizable test set of 31 English-language novels and their published Dutch human translations. It is multidimensional, comprising automatic MT evaluation metrics, error- and survey-based human evaluation, and quantitative automatic analyses, including the novel use of literariness prediction of translations. The results show that, overall, a literary-adapted system that combines sentence- and document-level information performs slightly better than DeepL (4% higher COMET score), with the edge being wider for genre fiction, while the gains over DeepL are smaller or negative for literary fiction. Code, data (public domain subset), and trained systems are available at https://github.com/antot/lit-mt-en-nl.
Original language: English
Title: Computer-Assisted Literary Translation
Editors: Andrew Rothwell, Andy Way, Roy Youdale
Publisher: Routledge
Chapter: 1
Pages: 27-52
Number of pages: 26
ISBN (electronic): 9781003357391
ISBN (print): 9781032413006
Status: Published - 2024
