Sonority-driven stress and vowel reduction in Uyghur*


Introduction
It is well known that the placement of stress may be conditioned upon syllable-internal properties, most notably syllable weight. A large body of literature has investigated the interaction of stress placement and syllable weight, also known as quantity-sensitivity (e.g. Hayes 1980, 1995; Halle & Vergnaud 1987; Gordon 2006; Hao & Andersson 2019; Koser & Jardine 2020). In languages with quantity-sensitive stress, stress preferentially falls on heavy syllables. In addition to quantity-sensitivity, a number of researchers have argued that stress may also exhibit sonority-sensitivity, where stress interacts with vowel sonority (Kenstowicz 1994, 1997; Morén 2000; de Lacy 2004, 2006; Crowhurst & Michael 2005). By and large, this work on sonority-sensitivity has focused on sonority's influence in languages with variable stress placement. In these languages, stress is preferentially attracted to high-sonority vowels, e.g. /a/, concomitantly avoiding less sonorous vowels, e.g. high vowels (cf. Shih 2016, 2018). As an example, consider the Gujarati data in (1). Stress typically falls on the penult (1a,b). However, when the vowel in the penult is less sonorous than another vowel, stress retracts to the antepenult (1c,d) or shifts to the ultima, yielding a degenerate foot (1e,f).
(1) Gujarati stress
Recent experimental work has, however, cast doubt on the generalizations illustrated above (Shih 2018; Shih & de Lacy 2019; Bowers 2019). Both Shih (2018) and Bowers (2019) find no evidence for the stress claims advanced in de Lacy (2006). In fact, Shih (2018) reports fixed penultimate stress, while the findings in Bowers (2019) provide weak support for fixed initial stress. Despite their differences, the two studies contradict earlier claims and contend that Gujarati should not factor into discussions of sonority-sensitive stress. Extrapolating from this finding, Shih & de Lacy (2019:16) pose the question, "So, is there solid evidence for a theory that claims there is a phonological mechanism that directly relates sonority to foot structure, thereby causing foot retraction and degeneration?"
Shih & de Lacy's question assumes an intimate link between sonority-sensitivity and variable stress placement. At a foundational level, any answer to their question must first address the assumed correlation between the two. To that end, this paper examines sonority-sensitivity in a language with fixed stress placement. Using production data from Uyghur, I argue that sonority is encoded as a weight distinction in the language, which accounts for the augmentation of stressed high vowels as well as the positional reduction of low vowels.

Uyghur
Background
Uyghur is a Turkic language with over ten million speakers in Central Asia. The language possesses an inventory of at least seven contrastive vowels, /ɑ ae o ø u y i/. Two other vowels have figured into analyses of the language, /e/ and /ɯ/. The mid front unrounded vowel is marginal, occurring in foreign loans or as the result of umlaut (e.g. /bɑl-i/ [beli] 'honey-POSS.3S'). The high back unrounded vowel is more controversial. As is common among Turkologists, Hahn (1991) posits underlying /ɯ/ that undergoes absolute neutralization to [i] (see also Lindblad 1990). Using acoustic evidence from harmony, McCollum (2019) suggests that /ɯ/ may contrast with /i/ in the language (cf. Vaux 2000 for a critique of this general set of analyses). Regardless of the underlying status of /ɯ/, it is clear that [ɯ] surfaces in the language, and the paper assumes a surface inventory of nine vowels. Backness and rounding harmony are both operative in the language. Backness harmony affects all non-initial vowels while rounding harmony affects high vowels only (Hahn 1991; McCollum 2019). Both harmonies are subject to an additional restriction: word-final high vowels resist both harmonies, surfacing as [i].
Uyghur allows a variety of syllable structures, ranging from V to CVC. It should be noted that although CVCC syllables are present orthographically, in most of these cases either the final consonant is deleted or an epenthetic vowel is inserted between the two coda consonants (Hahn 1991, 1998). V and CV syllable types are considered light, while CVC is considered heavy. Some analyses have posited long vowels as the result of historical changes and loans from Arabic and Persian (Hahn 1991; Yakup & Sereno 2016). In the present study, only native words and older (nativized) borrowings were examined.

Stress
Descriptions of stress placement in Uyghur have varied significantly. Thus, the impetus for the experiment described below was to further ascertain the nature of stress, how it is realized, and where it falls in the language. As a first step toward that end, the paper focuses on primary stress, leaving reported secondary stress for future work. Impressionistic descriptions of primary stress placement are listed in (2).
Somewhat interestingly, Hahn's two descriptions of the language differ in their assessment of stress. In his 1991 grammar, he suggests that stress is default-to-same (in the terms of Prince 1985), although in his 1998 book chapter he indicates that stress is final. Moreover, Engesaeth et al (2010) contend that stress is default-to-opposite, falling on the leftmost heavy syllable, or else the ultima. Beyond stress placement, descriptions of stress realization show few consistencies. Hahn (1991) and Engesaeth et al (2010) indicate that stress is realized by a high tone (pitch accent). Hahn also notes that increased intensity is a secondary correlate of stress, and that unstressed syllables are subject to reduction and devoicing. In contrast to these general, impressionistic claims, Yakup & Sereno (2016) as well as Major & Mayer (2018) present experimental evidence supporting duration as the only reliable cue to stress placement in the language. My own observations corroborate the findings in Yakup & Sereno (2016) and Major & Mayer (2018), with duration serving as the only significant cue to stress.

Experiment
Since final stress is the common denominator among the previous descriptions in (2), this study examines if stress falls on the ultima, and if so, if it is realized by increased vowel duration.

Speakers
Data was collected in Chunja, which is the seat of the Uyghur district of the Almaty region in southeastern Kazakhstan. Nine speakers participated in the study (4 females; age range 19-63; mean age 44 yrs), producing 6,836 syllables for acoustic analysis.

Task
Data collection proceeded in two phases. Participants were first taught a set of pictorial-lexical correspondences. For instance, a picture of a purple flower prompted the word /ɡyl/ 'flower' while a picture of a frog prompted the word /pɑqɑ/ 'frog'. After learning this set of correspondences, participants were taught a set of pictorial-grammatical prompts indicating number, case, and possession. As an example, two red arrows pointing down indicated locative case, /-dae/, while two side-by-side copies of the target prompt indicated plural number, /-laer/. In total, six suffixes were elicited that varied in both syllabic shape and underlying vowel height, shown in (3). Note that /-m/ 'POSS.1S' (3c) surfaces as a single consonant after vowel-final stems, e.g. /pɑqɑ-m/ [pɑqɑm] 'frog-POSS.1S'. However, when the stem-final segment is a consonant, the suffix consonant is preceded by an epenthetic high vowel that agrees with the preceding vowel in both backness and rounding, e.g. /ɡyl-m/ [ɡylym] 'flower-POSS.1S'. Mid vowels are limited to initial syllables, so all cross-height comparisons are between high and low vowels.
(3) Suffixes elicited
a. CVC: /-laer/ 'PL'
Target words were prompted from images on a laptop computer, and were produced in isolation. Elicited words were up to five syllables in length.

Analysis
After data collection, all words were segmented in Praat (Boersma & Weenink 2019). Since recent experimental work supports duration as the primary acoustic correlate of stress, vowel and consonant durations (in ms) were measured. Segment boundaries were aligned to spectrographic landmarks; the waveform was consulted only in instances where the spectrographic landmarks were unclear. Vowel onset and offset were defined as the onset and offset of the second formant (F2), and in cases where the vowel was immediately followed by a sonorant consonant, the offset was defined as the point of abrupt decrease in intensity. Consonantal onsets and offsets were determined in the same general manner. After segmentation, the data were analyzed in R using the lme4 package (Bates et al. 2014). A linear mixed effects model was used to predict vowel duration from vowel height, position (non-final or final), and syllable type (open or closed). The model's random effect structure included random intercepts for speaker, vowel height (high or low), and word length (one to five syllables). By-speaker random slopes for vowel height and word length were also included in the model. Statistical significance was assessed using model comparisons.
Three generalizations are evident in Table 1, which reports model estimates for mean vowel duration by height, position, and syllable type, and in Figure 1. First, final-syllable vowels are longer than other vowels. This result provides strong support for the recent claims that stress in Uyghur is realized with increased vowel duration (Yakup & Sereno 2016; Major & Mayer 2018). Second, low vowels are longer than high vowels. This is not terribly surprising, as low vowels are typically longer than high vowels (Lehiste 1970; Lisker 1974; see also Toivonen et al. 2015 for recent discussion). Third, vowels in closed syllables are longer than vowels in open syllables. Although this runs counter to the quasi-universal pattern of closed syllable vowel shortening (Maddieson 1985), it is not uncommon for the language family (Lahiri & Hankamer 1988; Jannedy 1995).
Interestingly, these generalizations do not account for the realization of high vowels in final open syllables. In this particular context, the high vowels are dramatically lengthened, and even approximate the duration of low vowels in final position. This result suggests that describing stress is not quite as straightforward as previous work suggests. Vowels in final syllables are lengthened, but the degree of lengthening varies significantly by vowel height and syllable type.
The study's findings are translated into a more phonological representation in Table 2. Data for low vowels and for closed-syllable final vowels are consistent with several previous descriptions, with stress falling on the final syllable, realized by phonetic lengthening. However, in the upper right portion of Table 2, I have transcribed the lengthening effect on high vowels in open syllables as phonological. The next section develops this, and in Section 6 I lay out an Optimality theoretic account of the Uyghur data.

Analysis
Throughout this section I develop the analysis previewed above. Concretely, I contend that stress in Uyghur is final and weight-sensitive, with vowel sonority being encoded as a weight distinction. Specifically, high vowels are monomoraic, while non-high vowels are bimoraic. My analysis makes two additional claims: that Uyghur requires stressed syllables to be heavy, and that codas contribute to syllable weight. I justify these claims in the following subsections.

Weight
There are four sources of evidence that suggest the importance of weight for Uyghur stress, encoded here as the mora. First, as argued in Hayes (1989), compensatory lengthening is one of the best heuristics for the mora (see also Kavitskaya 2014). In casual speech, the /r/ of the plural suffix /-laer/ is often deleted, and when it is, the preceding vowel lengthens (Nadzhip 1971:69; Hahn 1991:55-56). Thus, in a word like /saellaelaer/ 'turban-PL', optional deletion of the plural-final consonant induces lengthening of the (now word-final) /ae/, yielding [sael-li-laeː]. Six of the nine speakers who participated in the study regularly exhibited compensatory lengthening; the other three did not delete the final /r/ of the plural suffix. Before moving on, it is worth noting that most instances of compensatory lengthening occurred in the final syllable.
In addition to compensatory lengthening, the pattern of closed-syllable vowel lengthening above provides suggestive evidence for the mora. In Maddieson (1985), the only language discussed that does not shorten vowels in closed syllables is Japanese, a language often associated with the mora. If the use of the mora is reflected in phonetic lengthening of closed syllables, then the mora may be applicable more generally to the larger Turkic language family. Finally, recall from above that some previous descriptions of Uyghur report that weight is predictive of stress. Engesaeth et al (2010) argue that primary stress preferentially falls on a heavy penult, and in addition, Hahn (1991:28) claims that secondary stress falls on heavy syllables.
If Uyghur stress interacts with weight, then we can make sense of the lengthening of word-final high vowels. These vowels are monomoraic, and without a coda to contribute a mora, these syllables are light. If we assume that Uyghur requires stressed syllables to be heavy (weight-to-stress; Prince 1990; Prince & Smolensky 1993/2004), then lengthening is a means to satisfy this particular constraint in the language.

Moraic codas
The claim that word-final high vowels undergo lengthening to satisfy a constraint against light syllables in stressed position requires some more justification, specifically regarding the moraic status of codas. Again, the best source of evidence here is the pattern of compensatory lengthening reported above. I found few examples of /r/ deletion when the final consonant of the plural suffix was in onset position (e.g. /saellaelaeri/ [sael.li.ri] 'turban-PL-POSS.3S'), and when I did, there was no lengthening of the preceding vowel. As discussed in Hayes (1989), the fact that coda consonants may contribute a mora while onsets typically do not is supported by a range of patterns like that in Uyghur: coda consonant deletion yields compensatory lengthening while onset consonant deletion does not.
Building on the previous subsection, closed-syllable vowel lengthening lends credence specifically to the moraicity of codas. In Uyghur, the addition of an onset does not induce phonetic lengthening, but a coda does. In general, I assume that the polymoraic structure of closed syllables is fairly directly reflected in the phonetic patterns of the language, in line with results from Broselow et al (1997).
One final piece of evidence for the moraicity of codas comes from vowel raising. In medial open syllables, non-high vowels raise to high. In (5a-c), note that raising does not occur in final syllables or in medial closed syllables. However, in medial open syllables /ae/ raises to [i] (5d-f; in accordance with backness harmony, /ɑ/ raises to [ɯ]). Observe in (5f) that raising may target multiple vowels. Vowel raising in Uyghur is analyzable as reduction via mora deletion. Framed this way, the language employs raising as a means to enhance the syntagmatic contrast between stressed and unstressed syllables. Given binary distinctions for vowel height and syllable type, four types of syllables are permissible in the language: CV[-hi], CV[+hi], CVC[-hi], and CVC[+hi]. If high vowels are monomoraic, low vowels are bimoraic, and codas contribute a mora, then CV[+hi] is the only monomoraic syllable type, while CV[-hi] and CVC[+hi] possess two moras, with CVC[-hi] bearing three moras. If closed syllables with [-hi] vowels possess three moras, this would explain why these vowels are immune to raising: raising would not produce a light syllable.
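The weight assignments in this paragraph can be made concrete with a small sketch. It assumes exactly the mora values stated above ([+hi] vowel = one mora, [-hi] vowel = two, coda = one) and models raising as the loss of one vowel mora; the names `Syllable`, `moras`, `raised`, and `raising_yields_light` are illustrative only and are not part of the analysis itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    high_vowel: bool  # True for [+hi], False for [-hi]
    has_coda: bool    # True for CVC, False for CV


def moras(syl: Syllable) -> int:
    """Mora count assuming [+hi] = 1, [-hi] = 2, and one mora per coda."""
    return (1 if syl.high_vowel else 2) + (1 if syl.has_coda else 0)


def raised(syl: Syllable) -> Syllable:
    """Raising modeled as mora deletion: a [-hi] vowel surfaces as [+hi]."""
    return Syllable(high_vowel=True, has_coda=syl.has_coda)


def raising_yields_light(syl: Syllable) -> bool:
    """True only if raising a [-hi] vowel would leave a monomoraic syllable."""
    return (not syl.high_vowel) and moras(raised(syl)) == 1


SYLLABLE_TYPES = {
    "CV[+hi]": Syllable(high_vowel=True, has_coda=False),
    "CV[-hi]": Syllable(high_vowel=False, has_coda=False),
    "CVC[+hi]": Syllable(high_vowel=True, has_coda=True),
    "CVC[-hi]": Syllable(high_vowel=False, has_coda=True),
}

for label, syl in SYLLABLE_TYPES.items():
    print(label, moras(syl), raising_yields_light(syl))
```

Under these assumptions only CV[-hi] becomes light when raised, which matches the restriction of medial raising to open syllables.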
One might ask whether initial-syllable vowels are subject to raising as well. The Uyghur lexicon exhibits numerous cases of initial-syllable vowels that are higher than those in closely related languages. Some of these are shown in Table 3. Notice that in the top three rows, the second-syllable vowel is [+hi]; in such cases, fronting and raising is common across dialects (Nadzhip 1971; Hahn 1991; Yakup 2005). 1971:53-55; Hahn 1991:51-52). One important aspect of this pattern is that vowels in closed syllables are immune to it. With the root /jɑn/ 'return', the addition of the gerundive suffix /-ʃ/ triggers vowel epenthesis and initial-syllable raising, [je.nɯʃ] 'return-GER'. However, when the causative suffix /-dur/ is attached to this root, no such raising occurs, [jɑn.du.ruʃ] 'return-CAUS-GER', because the initial syllable is closed (Nadzhip 1971:54-55). Thus, initial-syllable raising is blocked in the same context as the medial raising pattern noted above: in a closed syllable. The fact that syllable type is predictive of raising in all non-final syllables strongly suggests that codas are moraic. [hi]. There are cases of final vowel raising in connected speech, but this appears qualitatively different from the types of raising just discussed; these occur at all speech rates.
In sum, the asymmetric lengthening of high vowels in word-final position can be explained as the addition of a mora. High vowels are monomoraic, and in the absence of a coda, the vowel is lengthened to satisfy a weight-to-stress constraint.
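As a rough illustration of this summary, the sketch below treats lengthening as a repair that adds one vowel mora whenever the stressed final syllable would otherwise be monomoraic. The function name `stressed_final_moras` and the tuple it returns are hypothetical conveniences; the mora values are the ones assumed in the text.

```python
from typing import Tuple


def stressed_final_moras(high_vowel: bool, has_coda: bool) -> Tuple[int, int]:
    """Return (vowel moras, coda moras) for a stressed word-final syllable.

    Assumes [+hi] = 1 mora, [-hi] = 2 moras, coda = 1 mora, and that a
    stressed syllable must carry at least two moras.
    """
    vowel = 1 if high_vowel else 2
    coda = 1 if has_coda else 0
    if vowel + coda < 2:
        vowel += 1  # lengthening: add a mora to the vowel
    return vowel, coda


for high, closed in [(True, False), (True, True), (False, False), (False, True)]:
    print(high, closed, stressed_final_moras(high, closed))
```

Only the open syllable with a high vowel triggers the repair; in the other three syllable types the stressed syllable is already heavy.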
It is thus possible to maintain a binary weight distinction if codas are only moraic after [+hi] vowels in Uyghur. Under Broselow et al's (1997) analysis, coda consonants might share a mora with the preceding vowel if that vowel was [-hi]. To test this prediction, I examined the duration of coda /m/ based on preceding vowel height, with random intercepts for speaker and word length (in syllables). Coda /m/ was 8.2 ms longer after [+hi] vowels [n = 805, χ²(1) = 7.2, p < .01], as seen in Figure 2 below. Although this finding is consistent with Broselow and colleagues' analysis of various Arabic dialects, the magnitude of the differences is important. In Jordanian, Syrian, and Lebanese Arabic, Broselow et al (1997) report differences in coda duration ranging from about 15 to 35 milliseconds, whereas the difference reported here is only 8 milliseconds. Moreover, perceptual findings suggest that the size of this lengthening effect is not noticeable, and thus not likely to be manipulated in the phonology of Uyghur (Huggins 1972; Klatt & Cooper 1975).

Final lengthening?
The analysis proposed in the previous section is straightforward, but the reader may wonder whether or not the entire analysis is confounded by the particular recording context. Linguists have mistakenly identified a number of stress-related properties in the past, particularly in less naturalistic data collection scenarios (see e.g. Dobrovolsky 1999; Gordon 2000; Goedemans & van Zanten 2007; Karlsson 2014 for discussion). In such instances either intonational or phonetic patterns are confused with word-level phonological patterns. Since the target words were elicited in isolation, it is important to address the question: does the duration pattern reported above actually fall out from phonetic final lengthening (Edwards et al 1991; Wightman et al 1992)? In the case of Uyghur, final lengthening is the most likely source of confusion since previous work has described the intonational pattern in Uyghur (Major & Mayer 2018), and additionally, the acoustic correlate of stress in the language, duration, is the primary manifestation of final lengthening as well. In my estimation, the Uyghur pattern does not fall out from phonetic lengthening for two key reasons: the asymmetry of this durational pattern, and the absence of boundary-related proximity effects.
First, final lengthening should produce a relatively symmetrical effect on both high and low vowels in word-final position. Final lengthening is not conditioned upon phonological category, but rather involves a mechanical transition toward an articulatory rest state. However, the degree of lengthening on high vowels far exceeds that on low vowels, as shown in Figure 3. If the effect were phonetic in nature, we would expect a smaller effect on high vowels in open final syllables, closer in magnitude to the effect observed in other contexts.
Figure 3: Percent increase in duration of stressed syllables by vowel height and syllable type
One additional property of phonetic lengthening is that it affects segments closer to the prosodic boundary more significantly (Berkovits 1993a,b, 1994). The role of proximity has been modeled with prosodic gestures in Articulatory Phonology, where the gesture gradually decreases in effect further from the edge of the prosodic unit (Byrd & Saltzman 2003; Byrd et al. 2006). If final lengthening is driving the differences above, then we should expect to see lengthening effects on word-final consonants as well (Berkovits 1993a,b).

To further examine the plausibility of phonetic lengthening, I compared the durations of word-final consonants with non-final coda consonants, specifically /m/ and /r/, in a model predicting duration from position with random intercepts for speaker and word length (in syllables). These two consonants occurred in codas in both final and non-final syllables. For both /m/ and /r/, tokens in word-final position were actually significantly shorter than tokens in non-final codas, which is shown in Figure 4 [m (n = 805): β = -32.7, χ²(1) = 67.7, p < .001; r (n = 489): β = -8.7, χ²(1) = 6.6, p = .01]. These findings are not consistent with an effect of phonetic lengthening, since the effect should be strongest on the segment immediately preceding the prosodic boundary. In this case, this predicts that word-final /m/ and /r/ should be significantly longer than their non-final coda counterparts, but the opposite was true: word-final consonants were actually shorter.
Although final lengthening predicts a significant durational effect at the right edge of words collected in isolation, there are good reasons to reject a final lengthening reanalysis of the Uyghur data. The lengthening of word-final high vowels far outstrips that of word-final low vowels, even though this phonetic effect should not target subsets of the phonological inventory. Additionally, phonetic lengthening would predict increased duration on word-final consonants, too, but the data do not align with this prediction. Since the results do not conform to a phonetic account of the data, I conclude that the pattern is phonological in nature and deserves a phonological account.
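For readers who wish to check the model-comparison statistics, the p-value for a likelihood-ratio test with one degree of freedom can be recovered from the χ² value alone. The sketch below is not the original analysis, which was run in R with lme4; it only uses the identity that the upper tail of a χ²(1) distribution equals erfc(√(x/2)). The function name `chi2_sf_df1` is an illustrative choice.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df.

    A chi-square(1) variable is the square of a standard normal, so the
    tail probability is erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Chi-square values reported for the coda-duration comparisons in the text.
for label, stat in [("coda /m/ by vowel height", 7.2),
                    ("final vs. non-final /m/", 67.7),
                    ("final vs. non-final /r/", 6.6)]:
    print(label, stat, chi2_sf_df1(stat))
```

The computed tail probabilities agree with the thresholds reported in the text (p < .01, p < .001, and p = .01 after rounding).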

Optimality theoretic account
The analysis sketched out in this section attempts to account for three basic facts: stress falls on the final syllable, stress induces lengthening of [+hi] vowels in word-final position only, and medial vowels are raised as a form of unstressed vowel reduction. The first of these, the finality of stress, is straightforward. Given the constraints in (6), so long as ALLFEET-R >> PARSE-σ and IAMB >> TROCHEE, stress falls on the final syllable.
(6) ALLFEET-R: assign a violation to every foot whose right edge is not aligned to the right edge of the word
PARSE-σ: assign a violation to every syllable that is not parsed into a foot
IAMB: assign a violation to every foot whose head is not aligned to the right edge of the foot
TROCHEE: assign a violation to every foot whose head is not aligned to the left edge of the foot
More interestingly, to account for the asymmetric lengthening of word-final high vowels, a set of constraints referring to syllable weight is necessary, defined in (7). So long as WEIGHT-TO-STRESS and WEIGHT-BY-POSITION are undominated, the language will require all stressed syllables to be heavy and all codas to contribute a mora.
(7) WEIGHT-TO-STRESS (W2S): assign a violation to every stressed syllable that is monomoraic
WEIGHT-BY-POSITION (WBP): assign a violation to every coda consonant that does not possess a mora
*HEAVY: assign a violation to every syllable that has more than one mora
ID-IO[hi]: assign a violation to every input-output vowel pair that disagrees for the feature [high]
In addition to the above constraints, I adopt two stringency constraints (de Lacy 2004, 2006) in (8) to model the interaction of vowel sonority and weight. Since only high and non-high vowels are distinguished in Uyghur, I don't attempt to develop a fuller encoding of these differences, although the reported differences in central versus high, and mid versus low could be exploited to derive a fuller set of constraints.
(9), but can also account for vowel raising. In (11), the second-syllable vowel is [-hi] and the ranking of *HEAVY over ID-IO[hi] induces raising to [i] to reduce the number of heavy syllables in the word (11c).
(11)
   /tikaedae/             | W2S | *HEAVY | ID-IO[hi]
   a. tiμ.kaeμμ.daeμμ     |     | **!    |
   b. tiμ.kiμμ.daeμμ      |     | **!    | *
☞  c. tiμ.kiμ.daeμμ       |     | *      | *
Recall from (5) that [-hi] vowels in medial closed syllables do not undergo raising, and this also falls out from the particular constraints and their ranking. So long as we assume a highly-ranked MAX-C constraint, the fact that *HEAVY does not differentiate between bi- and trimoraic syllables favors the faithful candidate in (12).
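The evaluation in tableau (11) can be sketched computationally. The example below assumes the strict ranking W2S >> *HEAVY >> ID-IO[hi] and encodes each candidate for /tikaedae/ as a list of syllables with a mora count, a stress flag, and a flag for a changed [high] value. The class `Syl`, the functions `violations` and `evaluate`, and the candidate encoding are illustrative assumptions, not part of the original analysis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Syl:
    moras: int            # surface mora count of the syllable
    stressed: bool        # True for the stressed (final) syllable
    height_changed: bool  # True if [high] differs from the input vowel


def violations(sylls: List[Syl]) -> Tuple[int, int, int]:
    """Violation counts ordered by ranking: (W2S, *HEAVY, ID-IO[hi])."""
    w2s = sum(1 for s in sylls if s.stressed and s.moras == 1)
    heavy = sum(1 for s in sylls if s.moras > 1)
    ident = sum(1 for s in sylls if s.height_changed)
    return (w2s, heavy, ident)


def evaluate(candidates: List[Tuple[str, List[Syl]]]) -> Tuple[str, List[Syl]]:
    """Pick the optimal candidate; tuple comparison mirrors strict ranking."""
    return min(candidates, key=lambda cand: violations(cand[1]))


# Candidates (a)-(c) from tableau (11), input /tikaedae/, final stress.
CANDIDATES = [
    ("a", [Syl(1, False, False), Syl(2, False, False), Syl(2, True, False)]),
    ("b", [Syl(1, False, False), Syl(2, False, True), Syl(2, True, False)]),
    ("c", [Syl(1, False, False), Syl(1, False, True), Syl(2, True, False)]),
]

winner = evaluate(CANDIDATES)
for name, sylls in CANDIDATES:
    print(name, violations(sylls))
print("optimal candidate:", winner[0])
```

Comparing violation tuples lexicographically is one simple way to implement strict domination: a single extra violation of a higher-ranked constraint outweighs any number of violations lower down.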

Conclusion
In sum, I have argued that stress is final in Uyghur, in conformity with several descriptions of the language. Results from a production study support the finality of stress. In particular, the extreme lengthening of word-final high vowels prompted an analysis of phonological lengthening of final [+hi] vowels. The fact that the pattern targets a subset of the vowel inventory and does not augment word-final consonants suggests that the pattern is phonological and not reducible to phonetic final lengthening. The phonological analysis developed in Section 4 centers around moraic weight, evidenced by compensatory lengthening and the domain of medial vowel raising. The OT analysis sketched in Section 6 was able to account for asymmetric lengthening of word-final [+hi] vowels, as well as the raising of medial [-hi] vowels in open syllables.
Evidence from Uyghur is thus consistent with the claim advanced in Shih & de Lacy (2019), namely that sonority distinctions play only an indirect role in stress. Sonority is never referenced directly in the analysis above, but rather is formally encoded as a weight distinction. To date, Uyghur provides the best evidence for sonority's influence on a language with fixed stress placement. Future work is necessary to corroborate these initial findings, and to experimentally evaluate stress-related claims in other languages.