Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training

Christian Roest, Lukas Edman, Gosse Minnema, Kevin Kelly, Jennifer Spenader, Antonio Toral

Research output: Academic, peer-reviewed

11 Citations (Scopus)
115 Downloads (Pure)

Abstract

Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed.
Original language: English
Title of host publication: Proceedings of the Fifth Conference on Machine Translation (WMT)
Publisher: Association for Computational Linguistics (ACL)
Pages: 274-281
Number of pages: 8
Status: Published - Nov 2020
Event: Fifth Conference on Machine Translation - Online
Duration: 19 Nov 2020 to 20 Nov 2020

Conference

Conference: Fifth Conference on Machine Translation
Abbreviated title: WMT20
Period: 19/11/2020 to 20/11/2020
