An Acoustic Analysis of Tone and Register in Louma Oeshi

This study describes the acoustic properties associated with tone and register in Louma Oeshi, a previously unstudied Akoid language of Laos. Louma Oeshi uses three tones (High, Mid, and Low) which overlap with a tense/lax register distinction to yield a six-way suprasegmental contrast. In this paper, we (1) offer a first account of the pitch and voice quality characteristics associated with each Tone-Register pair, (2) examine further the variability in glottalization strategies signaling the constricted register, and (3) explore the influence of contrastive voice quality on pitch and vice versa, particularly as a predictor of the variation in glottalization.

contrast yields 5 possible combinations, because a High Constricted combination is not permitted.
Kuang notes that, although they are not phonologically independent, phonation and pitch in the Yi languages are phonetically independent. Her study found that speakers of Southern Yi, Bo, and Hani produce tones such that "phonation can be kept constant while changing pitch, and pitch can be kept constant while changing phonation" (2013: 42). Lewis (1973) and Hansson (2003) note that the High Tense combination may occur in Akha, though neither provides examples and both state that it is very rare and primarily restricted to place and personal names. The High Tense category similarly seems to bear a low functional load in Oeshi. The speakers confirmed the existence of a High Tense tone, but although early fieldwork sessions explicitly aimed at eliciting lexical items, only four High Tense lexical items could be identified for a 100-item wordlist. One confound seems to be that tone and phonation can bear grammatical functions (as in Akha, per Hansson 2003), and thus words which may at times be produced as High Tense are in fact underlyingly Mid Tense or High Lax. Frequently, a final glottal stop marked citation forms for some verbs and nouns, though this phenomenon does not occur across all lexemes or particular word types and requires further investigation.

Methodology.
The study seeks to provide a first account of the phonetic characteristics associated with each Oeshi tone-register pair, focusing on the interaction of pitch and voice quality. Measures of pitch and voice quality were taken dynamically over the course of the syllable rime, and hierarchical regression modeling was used to test the effect of tone and register setting on each acoustic correlate.
3.1 PARTICIPANTS. Data is drawn from recordings with two male speakers in Mueang Mai District, Phongsaly Province, Lao PDR. In total, four male speakers participated, but less-than-ideal recording conditions (see §3.3) limit us to presenting data from the two individuals who spoke loudly and clearly enough for suitable analysis. Speaker A is from Moutern [mutən] village and was 22 years old at the time of recording. He has a relatively high level of education (one year of secondary school) and is a proficient Lao speaker. Speaker B hails from a neighboring village and was 19 years old at the time of recording. He is a mostly monolingual Oeshi speaker, having attended 3 years of school. He can comprehend, read and write Lao with some effort, but generally avoids doing so in his day-to-day life.
Village life is monolingual in Oeshi, with Lao and Khmu used by speakers who frequently leave the village. All Louma children are monolingual, as are their mothers. While children receive education in Lao, most drop out before developing any Lao proficiency.
3.2 MATERIALS. Each speaker repeated a 30-item word list that constituted roughly 5 near-minimal sets of the six-way contrast (see word list in Table 1 below). Each word was uttered twice in isolation and in two carrier phrases, one which positioned the keyword between two lax mid-tones and another which used the keyword sentence-initially before a tense low-toned word. In the analysis, these three contexts are referred to as (1) Phrase-medial, (2) Phrase-initial, and (3) Citation form. Only the first citation form utterance was analyzed, with extra repetitions discarded.
In analysis, fundamental frequency could be determined with confidence, but measures of spectral information reflecting voice quality were potentially hindered by the poor recording quality.
Instructions were provided in Lao to a bilingual Lao-Oeshi interpreter. Prior to the data elicitation, subjects provided recorded verbal consent to the research procedures in Oeshi. The interpreter reviewed the wordlist with each subject prior to recording and then prompted each word/utterance in a randomized order for elicitation.
3.4 ANALYSIS. Recordings were manually segmented using Audacity and processed in Praat 5.4 (Boersma & Weenink 2015). Decile measures of F0 and a suite of measures investigating phonation types (H1-H2, H1-A1, H1-A3, HNR, and SHR) were performed in VoiceSauce (Shue et al. 2011) and exported for statistical analysis by regression modeling in R (R Core Team 2012). The variety of acoustic measures employed has become common in recent phonetic literature regarding voice quality (e.g. Esposito 2006, Khan 2012, Berkson 2013, Kirby 2014). These measures are generally of two types: (i) those which take the spectral profile of a vowel and look at aspects of it which are thought to be indicative of the glottal source vibration (spectral slope, SHR) and (ii) those which measure the amount of noise generated at the glottis (HNR, SHR).
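To make the spectral-slope measure concrete, the following is a minimal sketch of an uncorrected H1-H2 computation on a synthetic two-harmonic signal. It is not the VoiceSauce procedure: the function names, the synthetic signal, and the single-frequency projection used to estimate harmonic amplitudes are all assumptions made for illustration, and no formant correction (the asterisk in H1-H2*) is applied.

```python
import math

def harmonic_amplitude(signal, sample_rate, freq):
    """Estimate the amplitude of the sinusoid at `freq` by projecting the
    signal onto a cosine and a sine at that frequency (a single-bin DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

def h1_minus_h2(signal, sample_rate, f0):
    """Uncorrected H1-H2 in dB: level of the first harmonic (at f0)
    minus level of the second harmonic (at 2 * f0)."""
    h1 = harmonic_amplitude(signal, sample_rate, f0)
    h2 = harmonic_amplitude(signal, sample_rate, 2 * f0)
    return 20 * math.log10(h1 / h2)

def synth_two_harmonics(f0, a1, a2, sample_rate=16000, n_samples=1600):
    """Hypothetical test signal: two harmonics with known amplitudes.
    n_samples is chosen to span a whole number of f0 periods."""
    return [a1 * math.sin(2 * math.pi * f0 * i / sample_rate)
            + a2 * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
            for i in range(n_samples)]

# Illustrative values only, not data from the study.
steep_slope = synth_two_harmonics(f0=100, a1=1.0, a2=0.5)
flat_slope = synth_two_harmonics(f0=100, a1=0.5, a2=1.0)
print(round(h1_minus_h2(steep_slope, 16000, 100), 2))
print(round(h1_minus_h2(flat_slope, 16000, 100), 2))
```

A signal whose first harmonic dominates yields a positive H1-H2, while one whose second harmonic dominates yields a negative value. Real speech would additionally require F0 tracking and windowing, which this sketch omits.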
The fixed effects of Tone (H, M, L), Register (Lax, Tense) and Frame (Initial, Medial, Citation) were tested in hierarchical linear regressions with each of the above measures as a dependent variable (DV). The individual word was included as a random effect, thereby mediating the degree to which production of a single word may influence the results. Values for each measure were tested as overall means as well as the mean at each decile point. Regression tests for each DV were initiated with a model that included each term (Tone, Register, Frame) and their interactions. Non-significant interactions and factors were pruned stepwise from the model until a final model was reached which included only significant factors.
Results of the regression tests are provided in Section 4 for the overall mean and the value at the 8th decile, i.e. a measure taken at the 4/5ths point of the vowel's duration. This location is provided because it yielded the strongest statistical confirmation of a contrast between the tones, a result which reflects the importance of the vowel terminus as a locus of contrastive voice quality and pitch movement.
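The decile sampling described above can be sketched as follows. This is an illustrative assumption about how a measurement track might be reduced to ten time-normalized points, not a description of VoiceSauce's internal method: the function names are hypothetical, the track is assumed to be sampled at equal intervals from vowel onset to offset, and linear interpolation is assumed between frames.

```python
import math

def value_at_fraction(track, fraction):
    """Linearly interpolate an equally spaced measurement track at a
    given fraction (0.0 = vowel onset, 1.0 = vowel offset)."""
    if not track:
        raise ValueError("track must contain at least one frame")
    pos = fraction * (len(track) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(track) - 1)
    weight = pos - lo
    return track[lo] * (1 - weight) + track[hi] * weight

def decile_points(track):
    """Values at 10%, 20%, ..., 100% of the track's duration."""
    return [value_at_fraction(track, k / 10) for k in range(1, 11)]

# Hypothetical F0 track in Hz (a steady fall), not data from the study.
f0_track = [200.0 - 2.0 * i for i in range(11)]
deciles = decile_points(f0_track)
eighth_decile = deciles[7]  # the 4/5ths point of the vowel's duration
print(eighth_decile)
```

Time-normalizing in this way lets vowels of different durations be compared point by point, which is what makes a per-decile regression possible.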

Results.
Acoustic analysis reveals a reliable 3-way F0 contrast between mostly level tones and a single interaction with register such that High Tense words consistently fall in pitch. Acoustic correlates of the Tense~Lax distinction are less clear, though the corrected H1-H2* most often captures the Oeshi tense register, particularly Low Tense forms.
Figures 1 and 2 offer representative examples of Lax and Tense syllables as produced by Speaker A in the Citation form context. Glottalization in the Tense form (Figure 2) is subtle but apparent in the uneven striations of the glottal pulses approaching the end of the vowel. In other cases, the vowel of a Tense syllable displays very few indicators of a constricted glottal state. Rather, the Tense register is manifest most prominently as glottalization of the onset or coda consonant, or even complete replacement with a final glottal stop. The nature of this variation (whether the precise phonetic realization of "tenseness" is a product of certain speakers, words, consonants or tones) is a worthwhile matter for future research but is not explored further in this paper. Comprehensive presentation of results for each relevant acoustic measure is similarly beyond the scope of the present work. In the sections below, we limit our presentation to results for F0 and H1-H2*, and then summarize the statistical inferences reached for the array of other measures.
Differences in pitch and phonation properties between the two sentential contexts were weak to non-existent. Admittedly, the two phrases bore similarities: the phrase-medial frame sentence placed the token of interest as the second overall element in the sentence, which may offer negligible contrast from the first overall position of phrase-initial contexts. Furthermore, tokens were generally produced with a topic focus, thereby imbuing the embedded tokens with a pronunciation that sometimes leans toward citation-form qualities. Ultimately though, the citation form utterances proved to have distinctive influences on the form of the words under investigation, such as a distinct final glottal stop missing in the sentence-medial form.
4.2 FUNDAMENTAL FREQUENCY. The mean F0 values at each time decile for each Tone-Register pair are plotted in Figure 3 (Speaker A) and Figure 4 (Speaker B). Lax register means are represented by a solid line, while Tense register means are represented by a dashed line. Plainly, a three-way divide in pitch height is evident, but overall four distinct F0 contours emerge: a low, a mid, a high falling and a high rising.
Note that there is very little overlap between the mean F0 values for the three tone levels. In fact, outside of the High Tense forms, which sometimes fall below Mid tone values late in the vowel, there is no overlap between the means. However, though not overlapping, the means were often separated quite narrowly. For example, Low Lax tokens of Speaker B were less than 10 Hz lower than the Lax and Tense Mid tones in citation form and phrase-initial words (Figures 4b and 4c).
Generally, register had little effect on the F0 of Low and Mid tone items. That is, F0 in Lax syllables was similar to F0 in Tense syllables. The one regular difference conditioned by register was between the High Lax and High Tense forms, the former rising or remaining high and level and the latter consistently realized as a fall. The late F0 values on the falling High Tense forms often fall below the Mid tone levels.
Regarding the effect of phrasal context, it is interesting that the clearest 3-way F0 contrast was found in the Phrase-medial condition, where a roughly 20 Hz separation was maintained between adjacent tone levels.
Statistical significance of F0 measures is given in Table 2, where each row displays the result of a separate regression test in which the occupied cells indicate the final regression model. Reference levels for each factor were: the M(id) tone for Tone, Lax for Register, and Phrase-medial for Frame. To demonstrate, for mean F0 as a DV, the model for Speaker A shows that:
• The Low tone as well as the High had a highly significant effect (***) on the mean F0 as differentiated from the mean Mid F0 level.
• The Tense register did not significantly alter F0 from the Lax reference level and was not included in the model.
As we might expect, Tone was a highly significant predictor of F0, both the mean F0 over the duration of the vowel and the F0 measured at various decile timepoints. Register, conversely, did not affect F0, except that there was an interaction for both speakers such that Tense tokens which were also High-toned had significantly lower F0 late in the vowel (the 8th decile). The effect of frame was such that Citation form utterances were significantly different from Phrase-medial forms. Looking to Figures 3-4, it can be seen that this effect is one of overall lower F0 values in the citation tokens.
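The role of the reference levels can be illustrated with a short sketch of treatment (dummy) coding, in which every coefficient is read as a difference from the Mid, Lax, Phrase-medial cell. Treatment coding is an assumption here: the text states the reference levels but not the contrast scheme, although this is R's default for unordered factors. The function and column names are hypothetical, and only the Tone by Register interaction is shown.

```python
# The first entry of each tuple is the reference level stated in the text.
TONE_LEVELS = ("M", "L", "H")
REGISTER_LEVELS = ("Lax", "Tense")
FRAME_LEVELS = ("Medial", "Initial", "Citation")

def treatment_code(value, levels, prefix):
    """One 0/1 indicator per non-reference level of a factor."""
    if value not in levels:
        raise ValueError(f"unknown {prefix} level: {value!r}")
    return {f"{prefix}[{lvl}]": int(value == lvl) for lvl in levels[1:]}

def design_row(tone, register, frame):
    """Predictor row for one token: intercept, main effects, and the
    Tone x Register interaction (products of the indicators)."""
    row = {"Intercept": 1}
    row.update(treatment_code(tone, TONE_LEVELS, "Tone"))
    row.update(treatment_code(register, REGISTER_LEVELS, "Register"))
    row.update(treatment_code(frame, FRAME_LEVELS, "Frame"))
    for lvl in TONE_LEVELS[1:]:
        row[f"Tone[{lvl}]:Register[Tense]"] = (
            row[f"Tone[{lvl}]"] * row["Register[Tense]"]
        )
    return row

# A Mid Lax Phrase-medial token activates only the intercept, so the
# intercept estimates the mean of that reference cell.
print(design_row("M", "Lax", "Medial"))
# A High Tense Citation token activates the High, Tense, Citation and
# High-by-Tense terms.
print(design_row("H", "Tense", "Citation"))
```

Under this coding, a significant Tone[H]:Register[Tense] term with no significant Register main effect corresponds to the pattern reported above, where register lowers late F0 only on High-toned tokens.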
4.3 MEASURES OF PHONATION. The array of measures distinguishing Lax from Tense register in Oeshi paints a much messier picture than does F0 for tone. Still, a loose pattern emerges wherein Tense tokens typically have lower values for each of the measures examined: H1-H2*, H1-A1*, H1-A3*, HNR 0-5k, HNR 3-5k, and SHR. Nevertheless, few of these differences were large enough or consistent enough to be statistically significant (see Tables 3, 4). Interestingly, Tone was a decent predictor of H1-H2*, patterned loosely such that the progression High > Mid > Low was progressively more constricted. This finding corresponds to the most consistent statistical finding for phonation: register as a predictor of H1-H2* on Low tones (and on High tones for just Speaker A). This effect is seen in the interactions in Table 3, interactions between tone and register that effectively indicate that there was a lack of a difference between Tense and Lax registers primarily on Mid tone forms. Although H1-A3* found a small significant effect of Tone for one speaker, our interpretation of these results is that H1-H2* was the most reliable metric of Oeshi register. Traces of mean H1-H2* over the vowel duration are given in Figures 5, 6.

DISCUSSION
Analysis of the speech of two Oeshi men has shown that F0 is reliably distinct between each of the tones but that measures of phonation type less consistently distinguish the Lax and Tense registers. A poor recording environment and quality of the recordings may in some part serve as a confound to these measures. Other potential sources may be linguistic, however.
For one, the articulatory basis of Oeshi laxness or tenseness may simply not be reflected well by certain measures. In this case, the present study is hampered by a small sample size. More speakers producing more words may offer a fairer sense of the acoustic difference in Oeshi registers.
More enticingly, our impressionistic analysis of Oeshi speech (both in person and in the recordings) suggests that register behaves as a property of the syllable, being realized by one or more of the following laryngeal modifications: (i) aspiration/preglottalisation of the onset, (ii) non-modal voice quality on the vowel (often solely at the vowel terminus), or (iii) a weak to full [h] or [ʔ] coda.
While the present data face many limitations, they also offer a satisfying glimpse into the suprasegmental phonetics of a previously undescribed language. The analysis provides a novel datapoint in a typological arena of much recent interest in the literature: laryngeally complex tones with concomitant pitch and phonation properties. A quick cognate comparison with other Southern Ngwi languages (Hansson 1989) supports Gregerson et al.'s (2011) suggestion to classify Oeshi (or perhaps Louma as an encompassing variety) as a distinct language. An example is the form for Proto-Lolo *wàk 'pig'. Sila and Oeshi retain onset and final constriction in /wà̰/, whereas Akha and most Hani varieties have /ɣa/. Similarly, 'iron' /ɕéŋ/ looks more like Mpi /siŋ⁵/ than Akha /sjhḿ/ or Hani /só/. In the context of similar mixed pitch-phonation systems in SE Asia, the possible typological status of this language variety has implications for models of historical reconstruction, tonogenesis and registrogenesis.
Many citation forms contain a final glottal stop, which is not present in utterance-medial position. The conflicting readings for final glottal constriction may go back either to emphasis or to a particular affix, if uninflected forms are prohibited by Oeshi phonotactics. This requires further investigation of the Oeshi phonological-syntactic interface, especially since grammaticalization of tone and phonation is common for Tibeto-Burman languages and has been observed for the related Southern Ngwi language Akha (Hansson The phonatory alternation may go back to a form of affixation, further supporting the view that Tibeto-Burman languages are not strictly morphologically isolating (Brunelle & Kirby 2016).
The rare but present occurrences of the High Tense tone invite a closer cross-linguistic look at possible mechanisms for either development or loss of High Tense tone in Southern Ngwi languages. Whereas modern Tibeto-Burman languages usually employ a mixed pitch-phonation system (à la Burmese, see Gruber 2011), the intersection of tone and phonation possibly points to an older stage of tono- and registrogenesis preserved in Oeshi.
The manifestation of laryngeal constriction in Oeshi at the onset, the kernel, and in coda position makes a look at underlying proto forms worthwhile. An attested form with both initial and final glottal constrictions [ˀgùʔ] 'rice grain' was reconstructed as *ʔgaw for Proto-Lolo (Bradley 1977b). A detailed comparison of the locality of glottal constriction in Oeshi with that of oral or glottal closure points in Proto-Lolo or Proto-Tibeto-Burman may provide further insight on the development of tone and phonation in Southern Lolo/Ngwi in particular and Tibeto-Burman in general.

Figure 3. F0 for Speaker A across 3 contexts
• The Citation form utterances were significantly affected (**) from the Phrase-medial level, but Phrase-initial tokens were not.