Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer

Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder

Research output: Academic, peer-reviewed


Abstract

Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. It generates weights for adapter modules conditioned on both task and language embeddings. By learning to combine task- and language-specific knowledge, our model enables zero-shot transfer to unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gains when a mixture of multiple resources is available, while remaining on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.
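The core mechanism described above — a single hypernetwork that maps a combined task and language embedding to the weights of a bottleneck adapter — can be illustrated with a minimal NumPy sketch. All dimensions, names, and the single-linear-layer hypernetwork here are illustrative assumptions, not the paper's actual architecture or configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck, d_emb = 16, 4, 8  # illustrative sizes

# Learned task and language embeddings (entries are hypothetical).
task_emb = {"pos": rng.normal(size=d_emb), "ner": rng.normal(size=d_emb)}
lang_emb = {"en": rng.normal(size=d_emb), "tr": rng.normal(size=d_emb)}

# Hypernetwork: one linear map from the concatenated (task, language)
# embedding to all adapter parameters (down- and up-projection matrices).
n_params = d_model * d_bottleneck + d_bottleneck * d_model
W_hyper = rng.normal(scale=0.01, size=(2 * d_emb, n_params))

def generate_adapter(task, lang):
    """Generate adapter weights for one task-language pair."""
    source = np.concatenate([task_emb[task], lang_emb[lang]])
    flat = source @ W_hyper
    W_down = flat[: d_model * d_bottleneck].reshape(d_model, d_bottleneck)
    W_up = flat[d_model * d_bottleneck:].reshape(d_bottleneck, d_model)
    return W_down, W_up

def adapter(h, W_down, W_up):
    """Bottleneck adapter with ReLU and a residual connection."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

# Zero-shot combination: weights for a task-language pair that need not
# have been observed jointly during training.
W_down, W_up = generate_adapter("ner", "tr")
h = rng.normal(size=(1, d_model))
out = adapter(h, W_down, W_up)
print(out.shape)  # (1, 16)
```

Because every adapter is produced from shared embeddings rather than stored separately, the parameter count stays constant as tasks and languages are added, which is the efficiency argument the abstract makes against training one adapter per task-language pair.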
Original language: English
Title: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 7934–7949
Number of pages: 16
Status: Published - 2022
