That Looks Hard: Characterizing Linguistic Complexity in Humans and Language Models

Gabriele Sarti*, Dominique Brunato, Felice Dell'Orletta

*Corresponding author for this work

Research output: Academic, peer-reviewed

7 Citations (Scopus)
21 Downloads (Pure)

Abstract

This paper investigates the relationship between two complementary perspectives in the human assessment of sentence complexity and how they are modeled in a neural language model (NLM). The first perspective takes into account multiple online behavioral metrics obtained from eye-tracking recordings. The second one concerns the offline perception of complexity measured by explicit human judgments. Using a broad spectrum of linguistic features modeling lexical, morpho-syntactic, and syntactic properties of sentences, we perform a comprehensive analysis of linguistic phenomena associated with the two complexity viewpoints and report similarities and differences. We then show the effectiveness of linguistic features when explicitly leveraged by a regression model for predicting sentence complexity and compare its results with the ones obtained by a fine-tuned neural language model. We finally probe the NLM’s linguistic competence before and after fine-tuning, highlighting how linguistic information encoded in representations changes when the model learns to predict complexity.
Original language: English
Title: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Editors: Emmanuele Chersoni, Nora Hollenstein, Cassandra Jacobs, Yohei Oseki, Laurent Prévot, Enrico Santus
Publisher: Association for Computational Linguistics (ACL)
Pages: 48-60
Number of pages: 13
Print ISBN: 978-1-954085-35-0
Status: Published - June 2021
Externally published: Yes
Event: Workshop on Cognitive Modeling and Computational Linguistics - Online
Duration: 10 June 2021 to 10 June 2021
https://aclanthology.org/volumes/2021.cmcl-1/

Workshop

Workshop: Workshop on Cognitive Modeling and Computational Linguistics
Short title: CMCL
Period: 10/06/2021 to 10/06/2021