AGILe: The First Lemmatizer for Ancient Greek Inscriptions

Evelien de Graaf, Silvia Stopponi, Jasper Bos, Saskia Peels-Matthey, Malvina Nissim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

To facilitate corpus searches by classicists and to reduce data sparsity when training models, we focus on the automatic lemmatization of ancient Greek inscriptions, which have received less attention in this respect than literary texts. We show that existing lemmatizers for ancient Greek, trained on literary data, perform poorly on epigraphic data because of major linguistic differences between the two types of text. We therefore train the first inscription-specific lemmatizer, which achieves above 80% accuracy, and make both the models and the lemmatized data available to the community. We also provide a detailed error analysis of the peculiarities of inscriptions, which further underlines the importance of a lemmatizer dedicated to them.
Original language: English
Title of host publication: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Place of Publication: Marseille, France
Publisher: European Language Resources Association (ELRA)
Pages: 5334-5344
Number of pages: 11
ISBN (Print): 9791095546726
Publication status: Published - Jun-2022
Event: The 13th Conference on Language Resources and Evaluation - Palais du Pharo, Marseille, France
Duration: 20-Jun-2022 to 25-Jun-2022
https://lrec2022.lrec-conf.org/en/

Conference

Conference: The 13th Conference on Language Resources and Evaluation
Abbreviated title: LREC 2022
Country/Territory: France
City: Marseille
Period: 20/06/2022 to 25/06/2022

Keywords

  • lemmatizer
  • ancient Greek
  • digital classics
