Abstract
Language models are now widely used by researchers, industry, and interested members of the public. However, language models of all sizes and types are primarily developed for English, while efforts for other languages lag behind. This dissertation explores how well non-English language models perform and how models for higher-resource languages can be adapted to lower-resource languages. With a focus on Dutch, we show high cross-lingual performance. Moreover, we find that language models can be adapted to other higher-resource languages (Dutch and Italian) or to low-resource languages (Gronings and Frisian) with minimal extra training. Finally, we examine how language similarity affects cross-lingual performance and find that previously reported low performance can be caused by the use of English as the source language.
| Original language | English |
|---|---|
| Qualification | Doctor of Philosophy |
| Awarding Institution | |
| Supervisors/Advisors | |
| Award date | 6-Jun-2024 |
| Place of Publication | [Groningen] |
| Publisher | |
| DOIs | |
| Publication status | Published - 2024 |