Viability of Automatic Lexical Semantic Change Detection on a Diachronic Corpus of Literary Ancient Greek

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

We apply two measures of lexical semantic change detection to Word2Vec embeddings trained on a diachronic corpus of literary Ancient Greek texts. The two measures are the Vector Coherence, based on the comparison between vectors of the same word in different time periods, and the J, based on the Jaccard coefficient, which quantifies the overlap between the k nearest neighbours in each possible combination of time slices. Through the analysis of the most stable and unstable words detected with both measures, we show that the two measures are effective at finding non-changed words, while Vector Coherence seems to be more reliable than J at detecting changed words. Still, low J could indicate a real semantic change when the same word also has a low Vector Coherence. For both measures, the detection of changed words is hampered by the presence of lemmatization errors in the training corpus.
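
The two measures described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each time slice is a plain dictionary mapping a lemma to a vector, that the vector spaces of the slices are directly comparable (for example, already aligned), that Vector Coherence is taken as the mean cosine similarity over all pairs of slices, and that J is the mean Jaccard coefficient over the same pairs. The function names and the toy vectors are hypothetical.

```python
from itertools import combinations

import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def vector_coherence(word, slices):
    """Mean cosine similarity of `word` with itself across all slice pairs.

    Assumption: averaging over pairs; the paper may aggregate differently.
    """
    return float(np.mean([cosine(a[word], b[word])
                          for a, b in combinations(slices, 2)]))


def nearest_neighbours(word, space, k):
    """Set of the k words most similar to `word` within one time slice."""
    sims = {w: cosine(space[word], v) for w, v in space.items() if w != word}
    return set(sorted(sims, key=sims.get, reverse=True)[:k])


def jaccard_overlap(word, slices, k):
    """Mean Jaccard coefficient of the k-nearest-neighbour sets of `word`,
    computed for each possible combination of two time slices."""
    scores = []
    for a, b in combinations(slices, 2):
        na = nearest_neighbours(word, a, k)
        nb = nearest_neighbours(word, b, k)
        scores.append(len(na & nb) / len(na | nb))
    return float(np.mean(scores))


# Toy data (hypothetical): "logos" keeps its vector, "agon" moves.
slice_1 = {
    "logos": np.array([1.0, 0.0, 0.0]),
    "mythos": np.array([0.9, 0.1, 0.0]),
    "polis": np.array([0.0, 1.0, 0.0]),
    "nomos": np.array([0.0, 0.9, 0.1]),
    "agon": np.array([0.0, 0.0, 1.0]),
}
slice_2 = {
    "logos": np.array([1.0, 0.0, 0.0]),
    "mythos": np.array([0.8, 0.2, 0.0]),
    "polis": np.array([0.0, 1.0, 0.0]),
    "nomos": np.array([0.0, 0.8, 0.2]),
    "agon": np.array([0.0, 1.0, 0.0]),
}
slices = [slice_1, slice_2]

stable_vc = vector_coherence("logos", slices)
changed_vc = vector_coherence("agon", slices)
stable_j = jaccard_overlap("logos", slices, k=1)
changed_j = jaccard_overlap("agon", slices, k=1)
```

In this toy setup the unchanged word scores high on both measures and the shifted word scores low on both, which corresponds to the case the abstract singles out, where a low J coincides with a low Vector Coherence.
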
Original language: English
Title of host publication: The First Workshop on Data-driven Approaches to Ancient Languages (DAAL 2024)
Subtitle of host publication: Proceedings of the Workshop
Editors: Colin Swaelens, Maxime Deforche, Ilse De Vos, Els Lefever
Place of Publication: Gent, Belgium
Publisher: Ghent University
Pages: 47-57
Number of pages: 11
ISBN (Print): 9789078848127
Publication status: Published - 27-Jun-2024
Event: The First Workshop on Data-driven Approaches to Ancient Languages (DAAL 2024) - Mercator A104 (Abdisstraat 1, 9000 Ghent, Belgium), Gent, Belgium
Duration: 27-Jun-2024 to 27-Jun-2024
https://www.dbbe2024.ugent.be/workshop/

Workshop

Workshop: The First Workshop on Data-driven Approaches to Ancient Languages (DAAL 2024)
Abbreviated title: DAAL 2024
Country/Territory: Belgium
City: Gent
Period: 27/06/2024 to 27/06/2024

Keywords

  • semantic change detection
  • Ancient Greek
  • language modelling
  • ancient language
  • word embeddings
  • word2vec
