Multilingual Learning and Adaptation for Neural Language Models

Ahmet Üstün

Research output: Thesis › Thesis fully internal (DIV)


Abstract

NLP technologies are unevenly distributed across the world's languages: state-of-the-art models are available for only a handful of them. Developing such language-specific models requires rich monolingual resources and labelled datasets, which are partly or entirely missing for many low-resource languages. This inequality in multilingual resources, together with the limited capabilities of existing NLP models for low-resource languages, drives us to explore more sophisticated solutions. This thesis presents a unified approach, consisting of a set of novel methods for multilingual learning and adaptation, to move current NLP technologies beyond a small set of resource-rich languages. We evaluate these techniques on particular tasks and targeted use cases such as zero-shot and unsupervised learning scenarios. We believe that our findings can serve as a basis for further analysis and that our techniques can be extended to billion-scale language models. As neural language models grow progressively larger, more effective and efficient adaptation methods can make NLP technologies fairer and more inclusive.
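To give a concrete sense of what "adaptation" can mean here, the sketch below shows a bottleneck adapter layer, one common form of parameter-efficient adaptation for pretrained language models. This is a minimal illustration only, not the specific method developed in the thesis; the class name, layer sizes, and placement are assumptions.

```python
# Minimal sketch of a bottleneck adapter (illustrative; not the
# thesis's actual implementation). Only the small adapter weights
# would be trained per language or task; the pretrained model stays frozen.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Down-project, apply a non-linearity, up-project, and add a
    residual connection around the whole block."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual keeps the pretrained representation intact;
        # the adapter learns only a small correction on top of it.
        return x + self.up(self.act(self.down(x)))

# Usage: one adapter per transformer layer, with all other weights frozen.
adapter = Adapter()
hidden_states = torch.randn(2, 16, 768)  # (batch, seq_len, hidden)
print(adapter(hidden_states).shape)      # torch.Size([2, 16, 768])
```

Because the bottleneck is small (here 64 vs. a hidden size of 768), each new language or task adds only a fraction of the parameters of the full model, which is what makes this family of methods attractive for low-resource settings.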
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • University of Groningen
Supervisors/Advisors
  • Bouma, Gosse, Supervisor
  • van Noord, Gertjan, Supervisor
  • Bisazza, Arianna, Co-supervisor
Award date: 30-Mar-2023
Place of publication: [Groningen]
Publication status: Published - 2023
