Direct Speech Quote Attribution for Dutch Literature

Andreas van Cranenburgh, Frank van den Berg

Research output: Academic, peer-reviewed

Abstract

We present a dataset and system for quote attribution in Dutch literature. The system is implemented as a neural module in an existing NLP pipeline for Dutch literature (dutchcoref; van Cranenburgh, 2019). Our contributions are as follows. First, we provide guidelines for Dutch quote attribution and annotate 3,056 quotes in fragments of 42 Dutch literary novels, both contemporary and classic. Second, we present three neural quote attribution classifiers, optimizing for precision, recall, and F1. Third, we perform an evaluation and analysis of quote attribution performance, showing in particular that quotes with an implicit speaker are challenging, and that such quotes are prevalent in contemporary fiction (57%, compared to 32% for classic novels). On the task of quote attribution, we achieve an improvement of 8.0 F1 points on contemporary fiction and 1.9 F1 points on classic novels. Code, data, and models are available at https://github.com/anonymized/repository.
Original language: English
Title: Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Editors: Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
Publisher: Association for Computational Linguistics (ACL)
Pages: 45-62
Number of pages: 18
Status: Published - 2023
Event: 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature - Dubrovnik, Croatia
Duration: 5 May 2023 to 5 May 2023

Conference

Conference: 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Country/Region: Croatia
City: Dubrovnik
Period: 05/05/2023 to 05/05/2023
