Abstract
NLP technologies are unevenly distributed across the world's languages: state-of-the-art models are available for only a handful of them. This is because developing such a language-specific model requires rich monolingual resources and labelled datasets, which are partly or completely missing for many low-resource languages. This inequality in multilingual resources, together with the limited capabilities of existing NLP models for low-resource languages in particular, drives us to explore more sophisticated solutions. This thesis presents a unified approach, consisting of a set of novel methods for multilingual learning and adaptation, to move current NLP technologies beyond a small set of resource-rich languages. We evaluate these techniques on specific tasks and targeted use cases such as zero-shot and unsupervised learning scenarios. We believe that our findings can serve as a basis for further analysis and that our techniques can be extended to billion-scale language models. As neural language models grow progressively larger, more effective and efficient adaptation methods can make NLP technologies fairer and more inclusive.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 30-Mar-2023 |
Place of Publication | [Groningen] |
Publisher | |
DOIs | |
Publication status | Published - 2023 |