Abstract
This paper investigates very low-resource language model pretraining, where fewer than 100 thousand sentences are available. We find that, in very low-resource scenarios, statistical n-gram language models outperform state-of-the-art neural models. Our experiments show that this is mainly due to the n-gram models’ focus on a local context. We therefore introduce three methods to improve a neural model’s performance in the low-resource setting, finding that limiting the model’s self-attention is the most effective one, improving performance on downstream tasks such as NLI and POS tagging by up to 5% for the languages we test on: English, Hindi, and Turkish.
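The abstract does not spell out how the self-attention is limited; the sketch below shows one common way to restrict attention to a local context, using a sliding-window causal mask in PyTorch. The window size, function names, and masking scheme are illustrative assumptions, not the paper's actual implementation.

```python
import torch

def local_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask allowing each position to attend only to itself and
    the `window - 1` preceding positions (a local, causal context).
    NOTE: illustrative assumption, not the paper's exact masking scheme."""
    idx = torch.arange(seq_len)
    rel = idx.unsqueeze(1) - idx.unsqueeze(0)  # rel[i, j] = i - j
    return (rel >= 0) & (rel < window)

def local_attention(q, k, v, window: int) -> torch.Tensor:
    """Scaled dot-product attention restricted to a local causal window.
    q, k, v: tensors of shape (batch, seq_len, dim)."""
    seq_len, dim = q.shape[1], q.shape[2]
    scores = q @ k.transpose(-2, -1) / dim ** 0.5
    mask = local_causal_mask(seq_len, window).to(q.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Example: 2 sentences of 16 tokens, 32-dim representations, window of 4
q = k = v = torch.randn(2, 16, 32)
out = local_attention(q, k, v, window=4)
print(out.shape)  # torch.Size([2, 16, 32])
```

Restricting attention this way mirrors the locality of an n-gram model: each token's representation depends only on a small number of preceding tokens rather than the full sentence.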
| Original language | English |
|---|---|
| Title | Proceedings of the 18th International Conference on Natural Language Processing (ICON) |
| Editors | Sivaji Bandyopadhyay, Sobha Lalitha Devi, Pushpak Bhattacharyya |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 86-92 |
| Number of pages | 7 |
| Status | Published - Dec 2021 |