Structuralists famously observed that language is "un système où tout se tient" (Meillet, 1903, p. 407), insisting that the system of relations among linguistic units is more important than their concrete content. This study attempts to derive content from relations, in particular phonetic (acoustic) content from the distribution of alternative pronunciations used in different geographical varieties. It proceeds from data documenting language variation, examining six dialect atlases, each containing phonetic transcriptions of the same set of words at hundreds of different sites. We obtain sound segment correspondences via an alignment procedure and then apply an information-theoretic measure, pointwise mutual information, which assigns smaller distances to pairs of sound segments that correspond relatively frequently. We iterate alignment and information-theoretic distance assignment until both remain stable, and we evaluate the quality of the resulting phonetic distances by comparing them to acoustic vowel distances. Wieling, Margaretha, and Nerbonne (2011) evaluated this method on the basis of Dutch and German dialect data; here we provide more general support for the method by applying it to four further dialect datasets (Gabon Bantu, U.S. English, Tuscan and Bulgarian). We find relatively strong, significant correlations between the induced phonetic distances and the acoustic distances, illustrating the usefulness of the method in deriving valid phonetic distances from distributions of dialectal variation. © 2011 Elsevier Ltd. All rights reserved.
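The pointwise mutual information step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `pmi_distances`, the input format (a flat list of aligned segment pairs), the probability estimates, and the conversion of PMI to a distance by subtracting each value from the maximum PMI are all assumptions made for illustration. The abstract specifies only that frequently corresponding pairs receive smaller distances.

```python
import math
from collections import Counter


def pmi_distances(aligned_pairs):
    """Turn aligned segment pairs into PMI-based distances (illustrative only).

    aligned_pairs: list of (x, y) tuples, one per aligned segment pair
    taken from pairwise alignments of dialect pronunciations (assumed format).
    """
    pair_counts = Counter()
    seg_counts = Counter()
    for x, y in aligned_pairs:
        # Treat correspondences as unordered: (a, e) and (e, a) are the same pair.
        pair_counts[tuple(sorted((x, y)))] += 1
        seg_counts[x] += 1
        seg_counts[y] += 1

    n_pairs = sum(pair_counts.values())
    n_segs = sum(seg_counts.values())

    pmi = {}
    for (x, y), count in pair_counts.items():
        p_xy = count / n_pairs          # relative frequency of the correspondence
        p_x = seg_counts[x] / n_segs    # relative frequency of segment x
        p_y = seg_counts[y] / n_segs    # relative frequency of segment y
        pmi[(x, y)] = math.log2(p_xy / (p_x * p_y))

    # Assumed conversion: higher PMI gives a smaller distance, minimum distance 0.
    max_pmi = max(pmi.values())
    return {pair: max_pmi - value for pair, value in pmi.items()}


# Toy data (invented): [a] aligns with [e] often, with [o] rarely.
toy_pairs = [("a", "e")] * 8 + [("a", "o")] + [("e", "o")]
toy_distances = pmi_distances(toy_pairs)
```

In the iterative procedure the abstract describes, distances of this kind would be fed back into the alignment step as segment substitution costs, and alignment and distance estimation would be repeated until neither changes.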
- FOREIGN ACCENT