Abstract
Languages are fundamental to human communication and serve as a means to express social and cultural values. However, many people treat languages as homogeneous entities, disregarding the fact that they are often composed of multiple varieties. These language varieties may be tied to certain geographical locations or the cultural identity of the speakers.
Studying language variation can thus provide valuable insights into how language varieties relate to their linguistic communities. Most language varieties do not correspond to administrative boundaries, such as provinces or states within nations, and neighboring varieties often transition gradually.
In this dissertation, we presented a new method to describe and model linguistic diversity. Specifically, we leveraged deep learning models (artificial neural networks) to quantify differences between the pronunciations of speakers of different language varieties. This new method assesses the differences between language varieties more accurately and efficiently than previously used methods.
Additionally, we investigated the use of these neural network models to develop speech technology to help empower language varieties. We developed an audio-based search algorithm that can automatically identify occurrences of a spoken search term in a large collection of spoken materials, improving access to resources that would normally require manual annotation. Furthermore, we presented approaches to improve speech recognition performance for several language varieties from different language families. This technology could, for example, be used to generate subtitles for videos or television broadcasts. Together, these contributions can be a promising step towards the important goal of developing speech technology that is inclusive of the world’s languages.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 16-Nov-2023 |
Place of Publication | [Groningen] |
Publisher | |
DOIs | |
Publication status | Published - 2023 |