Datasets and Models for Authorship Attribution on Italian Personal Writings

Gaetana Ruggiero, Albert Gatt, Malvina Nissim

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

    Abstract

    Existing research on Authorship Attribution (AA) focuses on texts for which a lot of data is available (e.g., novels), mainly in English. We approach AA via Authorship Verification (AV) on short Italian texts in two novel datasets, and analyze the interaction between genre, topic, gender and length. Results show that AV is feasible even with little data, but more evidence helps. Gender and topic can be indicative clues, and if not controlled for, they might overtake more specific aspects of personal style.

    Original language: English
    Title of host publication: Proceedings of the Seventh Italian Conference on Computational Linguistics, CLiC-it 2020, Bologna, Italy, March 1-3, 2021
    Editors: Johanna Monti, Felice Dell'Orletta, Fabio Tamburini
    Publisher: CEUR-WS.org
    Number of pages: 7
    Volume: 2769
    Publication status: Published - 2020
    Event: Italian Conference on Computational Linguistics 2020 - Bologna, Italy
    Duration: 1-Mar-2021 to 3-Mar-2021

    Conference

    Conference: Italian Conference on Computational Linguistics 2020
    Abbreviated title: CLiC-it 2020
    Country: Italy
    City: Bologna
    Period: 01/03/2021 to 03/03/2021
