UmBERTo-MTSA @ AcCompl-It: Improving Complexity and Acceptability Prediction with Multi-task Learning on Self-Supervised Annotations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

This work describes a self-supervised data augmentation approach used to improve the performance of learning models when only a moderate amount of labeled data is available. Multiple copies of the original model are first trained on the downstream task. Their predictions are then used to annotate a large set of unlabeled examples. Finally, multi-task training is performed on the parallel annotations of the resulting training set, and final scores are obtained by averaging the predictions of the annotator-specific heads. Neural language models are fine-tuned using this procedure in the context of the AcCompl-It shared task at EVALITA 2020, obtaining considerable improvements in prediction quality.
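
The procedure in the abstract can be illustrated with a small, self-contained sketch. This is a toy stand-in, not the authors' implementation: ridge regressors replace the fine-tuned language model copies, a small NumPy network with a shared layer replaces the multi-task model, and the data is synthetic. All names (`fit_ridge`, `train_multitask`, the number of annotators `K`, the layer sizes and learning rate) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression; stands in for one fine-tuned model copy."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def train_multitask(X, Y, hidden=16, lr=0.05, epochs=500, seed=0):
    """Shared tanh layer plus one linear head per annotator, trained with MSE."""
    r = np.random.default_rng(seed)
    W = r.normal(scale=0.1, size=(X.shape[1], hidden))  # shared encoder
    H = r.normal(scale=0.1, size=(hidden, Y.shape[1]))  # one column per head
    n = X.shape[0]
    for _ in range(epochs):
        Z = np.tanh(X @ W)
        err = Z @ H - Y                                  # shape (n, K)
        grad_H = Z.T @ err / n
        grad_W = X.T @ ((err @ H.T) * (1 - Z**2)) / n
        H -= lr * grad_H
        W -= lr * grad_W
    return W, H

# Synthetic data (assumption): a small labeled set and a larger unlabeled set.
w_true = rng.normal(size=5)
X_lab = rng.normal(size=(40, 5))
y_lab = X_lab @ w_true + 0.1 * rng.normal(size=40)
X_unl = rng.normal(size=(400, 5))
X_test = rng.normal(size=(50, 5))

# Step 1: train K copies of the base model on the labeled task
# (bootstrap resampling is used here only to make the copies differ).
K = 3
annotators = []
for _ in range(K):
    idx = rng.integers(0, len(X_lab), size=len(X_lab))
    annotators.append(fit_ridge(X_lab[idx], y_lab[idx]))

# Step 2: each copy annotates the unlabeled examples in parallel.
Y_pseudo = np.column_stack([X_unl @ w for w in annotators])  # shape (400, K)

# Step 3: multi-task training, one head per annotator.
W, H = train_multitask(X_unl, Y_pseudo)

# Step 4: the final score is the average of the annotator-specific heads.
scores = (np.tanh(X_test @ W) @ H).mean(axis=1)
```

In this sketch the heads share a single encoder, so each parallel annotation acts as an auxiliary task for the same representation; whether the gold labels are also included as a separate head is a design choice the abstract does not specify.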
Original language: English
Title of host publication: Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020)
Editors: Valerio Basile, Danilo Croce, Maria Di Maro, Lucia Passaro
Place of Publication: Online
Publisher: CEUR Workshop Proceedings (CEUR-WS.org)
Publication status: Published - 17-Dec-2020
Externally published: Yes
Event: Evaluation Campaign of Natural Language Processing and Speech Tools for Italian - Online
Duration: 17-Dec-2020 → …
Conference number: 7

Workshop

Workshop: Evaluation Campaign of Natural Language Processing and Speech Tools for Italian
Abbreviated title: EVALITA 2020
Period: 17/12/2020 → …

Keywords

  • natural language processing
  • deep learning
  • self-training
  • neural language models
  • multi-task learning
  • linguistic complexity
  • linguistic acceptability
