Corpus Phonetics for Under-Documented Languages: A Vowel Harmony Example

Timothy Kempton, Mary Pearce


Corpus phonetics is enabling the comprehensive analysis of large digital speech collections. In this paper, we develop a corpus phonetics workflow that is flexible enough to be applied easily to under-documented languages. To test the capabilities of this workflow, we choose a challenging vowel reduction and vowel harmony problem. For Kera (Chadic), Pearce (2012) has shown not only that phonetic reduction is linked to the phonetic duration of the vowel, but also that reduction is blocked in vowel harmony domains. Using the workflow, we are able to replicate previously published experiments by Pearce that were originally completed using manual measurements. We expect that our corpus phonetics workflow will be of value to phonologists working on other languages.


Keywords: forced alignment; vowel harmony; vowel reduction; under-documented language; phonology-phonetics interface

