Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

Abstract

Translating to and from low-resource polysynthetic languages presents numerous challenges for neural machine translation (NMT). We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed.
Original language: English
Title of host publication: Proceedings of the Fifth Conference on Machine Translation (WMT)
Publisher: Association for Computational Linguistics (ACL)
Pages: 274-281
Number of pages: 8
Publication status: Published - Nov-2020
Event: Fifth Conference on Machine Translation - Online
Duration: 19-Nov-2020 to 20-Nov-2020

Conference

Conference: Fifth Conference on Machine Translation
Abbreviated title: WMT20
Period: 19/11/2020 to 20/11/2020