The Effects of Character-Level Data Augmentation on Style-Based Dating of Historical Manuscripts

Lisa Koopmans, Maruf Dhali*, Lambert Schomaker

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

Identifying the production dates of historical manuscripts is one of the main goals of paleographers studying ancient documents. Automated methods can provide paleographers with objective tools to estimate dates more accurately. Previously, statistical features have been used to date digitized historical manuscripts, based on the hypothesis that handwriting styles change over time. However, the scarcity of such documents makes it difficult to build robust systems. This article therefore explores the influence of data augmentation on the dating of historical manuscripts. Linear Support Vector Machines were trained with k-fold cross-validation on textural and grapheme-based features extracted from historical manuscripts from different collections, including the Medieval Paleographical Scale, early Aramaic manuscripts, and the Dead Sea Scrolls. Results show that training models with augmented data improves the performance of historical manuscript dating by 1% to 3% in cumulative scores. Additionally, this indicates possibilities for further enhancement by considering models specific to the features and the documents’ scripts.
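
The abstract reports results as cumulative scores obtained under k-fold cross-validation but does not define either on this page. The sketch below is an illustration only, not the authors' code. It assumes a definition of the cumulative score that is common in manuscript-dating work: the percentage of samples whose absolute dating error is at most alpha years. The function names `cumulative_score` and `k_fold_indices`, the contiguous fold layout, and the example years are all hypothetical.

```python
from typing import List, Sequence, Tuple


def cumulative_score(true_years: Sequence[float],
                     predicted_years: Sequence[float],
                     alpha: float) -> float:
    """Percentage of samples with absolute dating error <= alpha years.

    Assumed definition; the abstract itself does not spell it out.
    """
    if len(true_years) != len(predicted_years):
        raise ValueError("inputs must have the same length")
    if len(true_years) == 0:
        raise ValueError("inputs must be non-empty")
    hits = sum(1 for t, p in zip(true_years, predicted_years)
               if abs(t - p) <= alpha)
    return 100.0 * hits / len(true_years)


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold. Real
    experiments would usually shuffle or stratify first; that is omitted.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # Hypothetical key years and predictions, for illustration only.
    true = [1300, 1325, 1350, 1375]
    pred = [1300, 1350, 1350, 1450]
    for alpha in (0, 25, 50, 100):
        print(alpha, cumulative_score(true, pred, alpha))
    print(k_fold_indices(5, 2))
```

In such a setup, a classifier (the abstract names linear Support Vector Machines) would be fitted on each fold's training indices, and the cumulative score would be computed on the held-out predictions for several values of alpha.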
Original language: English
Title of host publication: Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - ICPRAM
Place of Publication: Lisbon, Portugal
Publisher: SciTePress
Pages: 124-135
Number of pages: 12
Volume: 1
ISBN (Print): 978-989-758-626-2
Publication status: Published - 2023
Event: 12th International Conference on Pattern Recognition Applications and Methods - ICPRAM - Lisbon, Portugal
Duration: 22-Feb-2023 to 24-Feb-2023
https://icpram.scitevents.org/Home.aspx

Conference

Conference: 12th International Conference on Pattern Recognition Applications and Methods - ICPRAM
Country/Territory: Portugal
City: Lisbon
Period: 22/02/2023 to 24/02/2023

Keywords

  • Data Augmentation
  • Document Analysis
  • Historical Manuscript Dating
  • Self-Organizing Maps
  • Neural Networks
  • Support Vector Machines
