Fine-Tuning Strategies for Dutch Dysarthric Speech Recognition: Evaluating the Impact of Healthy, Disease-Specific, and Speaker-Specific Data

Spyretta Leivaditi*, Tatsunari Matsushima, Matt Coler, Shekhar Nayak, Vass Verkhodanova

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review


Abstract

Despite significant advances in automatic speech recognition (ASR) technology, the performance of such systems on dysarthric speech remains inadequate for widespread use. One key reason is the lack of sufficiently rich and diverse dysarthric speech datasets for training machine learning models that can handle all types and varieties of such speech. Motivated by this data scarcity, as well as by successful applications of self-supervised learning (SSL) in ASR for low-resource languages, this paper evaluates the effectiveness of three data-centric SSL training strategies for improving Dutch dysarthric speech recognition. The first strategy involves fine-tuning with both dysarthric and healthy speech data, the second with disease-specific data, and the third with speaker-specific data. The first and third strategies prove effective, while the second, though ineffective, provides valuable insights for further research.

Original language: English
Title of host publication: Proceedings of Interspeech 2024
Publisher: ISCA
Pages: 1295-1299
Number of pages: 5
Publication status: Published - 1-Sept-2024
Event: Interspeech 2024 - Kos, Greece
Duration: 1-Sept-2024 to 5-Sept-2024

Conference

Conference: Interspeech 2024
Country/Territory: Greece
City: Kos
Period: 01/09/2024 to 05/09/2024

Keywords

  • sarcasm
  • speech acoustics
