An Optimality Theoretic Analysis of Vowel Harmony in Kazan Tatar

Kazan Tatar is a Kipchak language spoken in the Republic of Tatarstan (Ethnologue). Previous literature has described a backness harmony system, with weak rounding harmony in the mid vowels (Comrie 1997, Berta 1998, Poppe 1968). This work utilizes novel data to investigate Tatar’s harmony under an Optimality Theory (OT) (Prince & Smolensky 1993) framework, contributing new observations regarding the lack of rounding harmony in Tatar, contrary to previous accounts. Through investigation of Tatar’s harmony system, we gain insight into the workings of the language’s phonology and find crucial evidence for the gradual decay of rounding harmony in Turkic languages.

This work seeks to investigate and analyze the present phonological behaviors of Tatar's vowel system. The analysis is based on transcribed speech data collected from two adult female native speakers of the language, who were 37 and 62 years old at the time of recording, respectively. The speakers are not related, do not know each other, and both grew up in Kazan, Tatarstan. Both speakers were monolingual until around five years old, approximately when they entered school and began to learn Russian. Each speaker read from a pre-determined word list consisting of nouns. The speakers provided various inflected forms of each word. The different suffixes were elicited using appropriate syntactic frames. The forms elicited include the nominative, the nominative plural, the dative, the ablative, and the second person plural possessive. Table 2.1 shows the sentence frames used. An initial analysis was conducted on the first speaker's data and verified by the transcribed data of the second.

Background.
In terms of previous literature, not much work has been done on Kazan Tatar. Two of the more well-known works are Nicholas Poppe's 1968 Tatar Manual, a descriptive grammar of Tatar, and Bernard Comrie's 1997 "Tatar Phonology," published in Phonologies of Asia and Africa: Vol 2. The rest of the accessible work is written in Russian, and phonological analyses and phonetic accounts in Russian are few and far between. In terms of published theses and dissertations that explore alternations, some more recent works are Jenna Conklin's 2015 thesis on long distance vowel assimilatory processes in Kazan Tatar, and Albina Davliyeva's 2011 thesis titled "An Investigation of Kazan Tatar Morphology." The language is also described throughout chapters of Lars Johanson's 1998 book, The Structure of Turkic, including a chapter dedicated to the description of Tatar and Bashkir written by Árpád Berta.
The target wordlist was crafted from a variety of sources and verified as lexically valid by speakers. I began with examples listed in Comrie 1997 to verify his data claims. I added other items to the list using a Russian-Tatar dictionary (Ganieva et al. 2009), a Tatar-English online dictionary hosted at Glosbe.com, and, when needed, Wiktionary entries. The words were selected to represent both multisyllabic and monosyllabic words, loanwords, and compound words. Once the speakers became acquainted with the wordlist, they first read it in the nominative (uninflected) form, and then provided the nominative plural. Following this, the speakers were asked to place each word into various sentence frames to elicit the target case endings. The cases I targeted were based in part on Comrie 1997. The nominative serves as the "neutral" data against which inflected forms are compared. Nominative plurals are well known as good examples of vowel harmony-driven allomorphy in the Turkic languages and are fast to elicit. More complicated examples are elicited using sentence frames. Comrie 1997 mentioned the second person plural possessive as a good place to find rounding harmony effects. The dative and ablative were added to the elicitation set in order to make sure that vowel harmony applies across multiple morphophonological domains, as there are some phonological processes in Tatar that only occur in specific cases. For example, Davliyeva 2011 notes that nasal assimilation processes only occur in the plural and ablative cases, and Comrie 1997 mentions these restricted assimilatory processes as well. Neither attests the dative as undergoing these changes. As expected, all morphemes tested undergo vowel harmony, giving evidence that the process reapplies as agglutination occurs.
3. Kazan Tatar Phonetics.
The phonemic inventory of Kazan Tatar, as presented in Comrie 1997, has twenty-five consonant phonemes and ten vowel phonemes. For the purposes of this vowel harmony analysis, Comrie's phonemic inventory is used, pending planned "big data-driven" phonetic analysis of the language. There are three approximants, one trill, three nasals, ten fricatives, and eight stops. He uses the place terms "front velar" and "back velar" to highlight consonants carrying features for [back]. It is worth noting that in my own acoustic analysis, I found that the vowel space is significantly more reduced than it is presented in the literature, even when controlled for stressed versus unstressed syllables. This has inspired forthcoming work on large-scale acoustic analysis of the Kazan Tatar vowel space using COVAREP (Cooperative Voice Analysis Repository for Speech Technologies, Degottex et al. 2014), an open-source repository of speech processing algorithms. By using formant tracking and subsequent vowel-space mapping using vector quantization, a data-driven picture of Tatar's vowel inventory can be revealed. For the purposes of this analysis, however, I used Comrie's vowel inventory and mapped each observed vowel to the closest vowel in that inventory to assess the vowel harmony system. As for consonant discrepancies from Comrie 1997, the phonological patterning in the language leads me to believe that Tatar has uvular fricatives rather than velar ones. Phonotactics dictate that front vowels pattern with velar consonants, and back vowels pattern with uvular consonants, as seen below.

Vowel Harmony in Kazan Tatar.
Comrie 1997 describes the vowel harmony system of Tatar as being sensitive to features of backness and rounding, with the rounding process being gradient in the mid vowels. Poppe 1968 suggests this sort of process as well. Johanson 1998 presents conflicting accounts across its chapters: in those Johanson wrote, he attests rounding harmony in the mid vowels, whereas Berta's chapter (Berta 1998) describes it as weakly developed. My data showed backness harmony but, contrary to previous accounts, no rounding harmony. Suffixes described as having four allomorphs had only two, one for [-back] and one for [+back]. Conklin's 2015 thesis on long distance vowel assimilatory processes in Kazan Tatar was extremely helpful in conducting these analyses, as we found similar results. Conklin's data was consistent with my own finding that previous accounts incorrectly describe the present-day vowel harmony system of the language. Conklin and I found that Kazan Tatar has only backness harmony, while others assert that there are both rounding and backness systems of harmony.
No word-internal disharmony is phonologically acceptable in native Tatar words. The domain over which vowel harmony applies in Tatar is the prosodic word, or PWd. The prosodic word in Tatar contains the lexical root and all agglutinative suffixes attached to that root. Here we can see an example of a simple case of vowel harmony, demonstrated by the plural morpheme, which can be represented in underspecified form as -lVr.
In each case, the vowel in the -lVr plural morpheme becomes specified as either plus or minus back based on the backness value of the vowels in the word. Compound words can sometimes appear disharmonic on the surface, but they are actually not. The "left-to-right nature" of the harmony process, as Comrie describes it, dictates that the final syllable is what governs allomorph selection. If boundaries are judiciously applied in analysis, then this left-to-right nature of the language can easily account for the "apparent disharmony" as well as for which vowel is selected in allomorphs.
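The selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vowel sets cover just the symbols that occur in this paper's examples rather than Comrie's full ten-vowel inventory, the surface shapes `lær` and `lɑr` for -lVr are assumed for the demonstration, and the function names are hypothetical. Nasal assimilation of the suffix-initial consonant is deliberately ignored.

```python
# Backness classes for the vowel symbols used in this paper's examples.
# Assumption: an illustrative subset, not the full inventory.
FRONT_VOWELS = set("eiø")
BACK_VOWELS = set("ɑoɯɤ")


def last_vowel_backness(stem):
    """Return 'front' or 'back' for the rightmost vowel of the stem."""
    for segment in reversed(stem):
        if segment in FRONT_VOWELS:
            return "front"
        if segment in BACK_VOWELS:
            return "back"
    raise ValueError(f"no known vowel in stem: {stem!r}")


def attach_suffix(stem, front_form, back_form):
    """Attach the allomorph that agrees in backness with the final stem vowel."""
    if last_vowel_backness(stem) == "front":
        return stem + front_form
    return stem + back_form


# Assumed surface shapes of the plural -lVr, for illustration only.
PLURAL_FRONT, PLURAL_BACK = "lær", "lɑr"

print(attach_suffix("joldɯz", PLURAL_FRONT, PLURAL_BACK))  # takes the back allomorph
print(attach_suffix("ebi", PLURAL_FRONT, PLURAL_BACK))     # takes the front allomorph
```

Because only the rightmost vowel is consulted, a compound whose two members differ in backness receives the allomorph matching its final member, which is the behavior the left-to-right description predicts.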

In the case of loanword phonology, word-internal disharmony is permitted. However, the agglutinative suffixes must choose a feature specification of either plus or minus back. Most of the loanwords into Tatar are either Russian, Arabic, or Persian. Comrie 1997 mentions a general pattern of treating Russian words that contain both front and back values as [+back] and treating the vowels /e/ and /i/ in those words as neutral in the context of vowel harmony.
Table 4.3: Loanword disharmony.
In the examples above, the word for locksmith, /slesɑr/, comes from Russian through German, and the word soviet (the Soviet-era noun approximately meaning 'council'), /sɑvjet/, comes from Russian. We see what appears to be word-internal disharmony in /sɑvjet/, which has both the front vowel /e/ and the back vowel /ɑ/ in the same word, but because Tatar treats the vowel /e/ as transparent, it is permissible. /slesɑr/ is handled through constraints that permit skipping syllables, which are demonstrated later in this work.
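The pattern attributed to Comrie 1997 above can be made concrete with a short sketch. It implements only that reported generalization (a loanword containing any back vowel selects [+back] suffixes, with /e/ and /i/ not counting against it), not the constraint-based analysis developed later. The vowel sets are an illustrative subset, the function name is hypothetical, and the fallback to "front" for loanwords with no back vowel is an assumption.

```python
# Illustrative subset of vowel symbols used in this paper's examples.
FRONT_VOWELS = set("eiø")
BACK_VOWELS = set("ɑoɯɤ")


def loanword_suffix_backness(stem):
    """Backness that suffixes take on a Russian loanword, per the reported pattern.

    Any back vowel in the stem makes the word behave as [+back]; front
    vowels such as /e/ and /i/ are transparent and do not block this.
    Assumption: stems with no back vowel at all take front suffixes.
    """
    if any(segment in BACK_VOWELS for segment in stem):
        return "back"
    return "front"


for word in ("sɑvjet", "slesɑr"):
    print(word, loanword_suffix_backness(word))
```

Both example loanwords contain /ɑ/ alongside /e/, so both come out as [+back] regardless of which of the two vowels is closer to the suffix.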

Lack of Rounding Harmony.
An interesting finding from this study was the lack of rounding harmony in the data. It is asserted as a gradient process in the mid vowels in most previous accounts. Conklin 2015's work also shows that there is no rounding harmony in the language. As for Comrie 1997, it is worth noting that Comrie mentions his work was completed using secondary sources and not primary data analysis (Comrie, personal communication, 5/30/2017), so, given the lack of data, there is no way of knowing whether even weak rounding harmony was previously present, or whether it was still present by 1997.
When I tried to elicit words that were attested to contain rounding harmony, I used the same form as provided by Comrie 1997, the second person plural possessive -vGvz. If rounding harmony were present, then there would be four possible allomorphs: a front rounded, a front unrounded, a back rounded, and a back unrounded. However, the data show only two surfacing allomorphs, a front and a back, with no changes with respect to the [α round] features of preceding vowels. The surfaced forms are -/egez/ and -/oɣoz/.

Were rounding harmony an extant process in Tatar, these words would have to end in -/øgøz/ and -/ɤɣɤz/ respectively. However, they do not match in values of roundness. There are only two surfaced allomorphs of this suffix, which is solid evidence for the lack of rounding harmony.
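The two competing hypotheses can be laid side by side in a short sketch, which makes explicit where their predictions diverge. The four suffix shapes are the ones given in the text; the dictionary layout and function names are hypothetical, and the feature labels describe the final stem vowel.

```python
# Backness-only harmony: the two allomorphs that actually surface.
BACKNESS_ONLY = {"front": "egez", "back": "oɣoz"}

# Backness plus rounding harmony: the four allomorphs it would require.
WITH_ROUNDING = {
    ("front", "unrounded"): "egez",
    ("front", "rounded"): "øgøz",
    ("back", "rounded"): "oɣoz",
    ("back", "unrounded"): "ɤɣɤz",
}


def predict(backness, rounding):
    """Return (backness-only prediction, backness-plus-rounding prediction)."""
    return BACKNESS_ONLY[backness], WITH_ROUNDING[(backness, rounding)]


def diagnostic_contexts():
    """Final-vowel types for which the two hypotheses predict different suffixes."""
    return [
        context
        for context, form in WITH_ROUNDING.items()
        if BACKNESS_ONLY[context[0]] != form
    ]


print(diagnostic_contexts())
print(predict("back", "unrounded"))
```

Only stems ending in a front rounded or a back unrounded vowel distinguish the hypotheses, so these are the contexts in which the elicited forms are informative.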

In Conklin 2015, extensive and thorough phonetic data analysis is used to disprove the presence of rounding harmony. Methods based in both articulatory and acoustic phonetics are employed. The reason for such extensive analysis is to rule out the alternative to "no rounding harmony," namely weak or gradient rounding harmony. From the structure of the collected words alone, it is evident that there is no true rounding harmony in the language. Rounded and unrounded vowels coexist freely within native words; the language's only requirement is that they match in backness. And as mentioned above, only two allomorphs exist where rounding harmony would dictate the surfacing of four.
The hypothesis that this process could have existed in the past is reasonable, but we unfortunately do not have data capturing the specific decay process. McCollum 2016 describes the decay of rounding harmony across other Turkic languages. From this, it is plausible that rounding harmony is a less salient process than backness harmony, and it is therefore unsurprising that rounding harmony is absent from the language today.

Optimality Theoretic Analysis.
Walker 2012 details the many ways vowel harmony can be handled under Optimality Theory with illustrated examples. For the purposes of this analysis, I chose a base set of constraints based upon Walker 2012's Turkish examples. The harmony driving constraint for Tatar's backness system is sᴘʀᴇᴀᴅ, which can be stated as "For all tokens of [back] in a prosodic word, if a token is linked to any segment, it is linked to all segments." It is paired with a faithfulness constraint of ɪᴅᴇɴᴛ-ɪᴏ([back]), or "corresponding segments in the input and output have identical values for the feature [back]." This is the base of the constraint set explaining the harmony system, and it is adequate for most cases. We can observe that this analysis pans out for words in the singular, looking again to the earlier provided examples of /joldɯz/ 'star' and /ebi/ 'grandma'.
The vowel harmony analysis holds if compound words are treated as two distinct units, rather than one single word, even when agglutinative processes occur. Note with the example below, there is also a nasal assimilation process (causing the plural of 'birthday' to be -nVr instead of -lVr).
Given that Tatar words are generally well-behaved in terms of vowel harmony, the analysis holds for most of the language as long as it is native words that are being examined. In general, Tatar speakers also have demonstrated good sense about which words are loanwords and which are native Tatar words. The first consultant reported which language each word came from, and the second initially had difficulty with the Russian words in isolation in particular. Once sentence frames were provided she was able to put them in the proper grammatical declension, but prior to that she insisted they were "Russian words and not Tatar words," verbatim. Therefore, it is not unreasonable to speculate that for Tatar speakers, there exist separate strata with separate constraint sets for nonnative words.
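How the base constraint set selects a winner for native words can be sketched as a small evaluator. Several simplifications are assumed here and are not taken from the analysis itself: words are reduced to the [back] values of their vowels ("+" or "-", with `None` for the unspecified suffix vowel), sᴘʀᴇᴀᴅ is counted as one violation per output vowel that disagrees with the first vowel of the prosodic word, and sᴘʀᴇᴀᴅ is assumed to dominate ɪᴅᴇɴᴛ-ɪᴏ([back]). All function names are hypothetical.

```python
def spread_back(inp, out):
    """SPREAD([back], PWd), simplified: one violation per output vowel
    whose [back] value differs from that of the first vowel."""
    return sum(1 for value in out if value != out[0])


def ident_io_back(inp, out):
    """IDENT-IO([back]): one violation per vowel whose specified input
    value is changed in the output. Unspecified inputs (None) are free."""
    return sum(1 for i, o in zip(inp, out) if i is not None and i != o)


# Assumed ranking for native words: SPREAD >> IDENT-IO([back]).
RANKING = [spread_back, ident_io_back]


def violation_profile(inp, out, ranking=RANKING):
    """Violation counts ordered from highest to lowest ranked constraint."""
    return tuple(constraint(inp, out) for constraint in ranking)


def evaluate(inp, candidates, ranking=RANKING):
    """Return the candidate with the best profile under strict domination,
    which corresponds to lexicographic comparison of the profiles."""
    return min(candidates, key=lambda out: violation_profile(inp, out, ranking))


# /joldɯz/ + -lVr: two [+back] stem vowels and an unspecified suffix vowel.
star_plural = ("+", "+", None)
candidates = [("+", "+", "+"), ("+", "+", "-"), ("-", "-", "-")]
for out in candidates:
    print(out, violation_profile(star_plural, out))
print("winner:", evaluate(star_plural, candidates))
```

The fully harmonic, faithful candidate wins because the disharmonic candidate violates the higher-ranked constraint and the fully front candidate changes two specified stem vowels.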
To handle loanwords, more constraints must be added to the analysis so that disharmony is allowed only in loanwords. To the base constraint set, *sᴋɪᴘ-σ is added to handle allowable transparency, as demonstrated in Walker 2012 for Finnish transparency. The constraint assigns a penalty when feature spreading skips an intervening syllable. When ranked below the harmony driving constraint, it is violable, which is what allows the transparency.
Since another way to handle disharmony would be to simply delete the vowels that disrupt the dominant harmony feature, ᴍᴀx-ᴠ, which specifies that all underlying vowels in the input must have a correspondent in the output, is added as an undominated constraint to avoid deletion. To avoid changing the loanword in any way to prevent disharmony, the constraint ʟᴏᴀɴᴡᴏʀᴅ ᴄᴏʀʀᴇsᴘᴏɴᴅᴇɴᴄᴇ (hereafter abbreviated as ᴄᴏʀʀ-ʟᴡ), under which every segment in a loanword input must have a correspondent in the output (Tsuchida 1995), is added. Lastly, the agglutinative stratum for loanwords calls for a specialized ɪᴅᴇɴᴛ constraint in order to source affix values for [α back] from the first syllable in the stem for loanwords: ɪᴅᴇɴᴛ-sσ1-ᴀ([back]). This leads us to the following finalized constraint set:
MAX-V: all underlying vowels must have a correspondent in the output.
LOANWORD CORRESPONDENCE (abbreviated ᴄᴏʀʀ-ʟᴡ): every segment in the input must be in the output for loanwords (Tsuchida 1995).
*SKIP-σ: assigns a penalty when feature spreading skips an intervening syllable. When ranked below the harmony driving constraint, it is violable to allow transparency.

IDENT-Sσ1
The selected examples illustrate how disharmonic loanwords are able to surface under this constraint set and ranking.
This result is contrary to previous accounts, which have attested rounding harmony to be present to some extent in the mid vowels. This analysis contributes new insight into the present-day state of Kazan Tatar vowel phonology and hints at the lesser salience of rounding harmony as a phonological process. It lays the groundwork for planned future work on large-scale phonetic analyses and subsequent computational modelling of the learning of vowel harmony.
However, it is fully reasonable that, if Tatar once had rounding harmony, it no longer does, given the observed decay of such harmonic processes across Turkic languages per McCollum 2016. That the decay is so noticeable across the language family suggests that rounding harmony is a much less salient process than backness harmony and is less likely to be preserved. While the lack of rounding harmony today unfortunately does not give much insight into how the decay process may have worked in Tatar, it does provide a crucial data point, adding to the list of Turkic languages that were once attested to have rounding harmony processes but no longer do, and giving more credence to the decay phenomenon overall.
The domain over which vowel harmony applies is the prosodic word. Vowel harmony applies both word internally and after all agglutinative processes, suggesting that allomorphs are underspecified in terms of backness in their vowels until they undergo the harmony process. For native Tatar words, word-internal disharmony is not permitted. In the OT analysis presented, the primary harmony driving constraint is sᴘʀᴇᴀᴅ (demonstrated in Walker 2012 for Turkish). It is paired with ɪᴅᴇɴᴛ-ɪᴏ constraints sensitive to the feature of backness to create a simple yet elegant analysis that works for most native word cases.
Cases where things become trickier involve compound words, which behave as two phonologically separate units given that they can look, at surface value, disharmonic.But if each component of the compound is treated as its own prosodic word by judiciously applying boundaries and holding to the left-to-right agglutinative process of the language, then the analysis holds for both unaffixed and affixed compounds.
The constraint set needs to be expanded in the case of loanwords. Given Tatar speakers' intuitive knowledge of whether a word is native or not, it is reasonable to assume there is a different stratum with different constraint rankings specifically for loanwords. The base harmony driving constraint (sᴘʀᴇᴀᴅ) and its faithfulness constraint (ɪᴅᴇɴᴛ-ɪᴏ) remain at the core of the analysis, but some new constraints are ranked above and below them. *sᴋɪᴘ-σ is ranked below the harmony driving constraint to allow transparency of certain vowels in loanwords, as demonstrated in Walker 2012 for Finnish. ᴍᴀx-ᴠ is added as an undominated constraint to rule out vowel deletion. ʟᴏᴀɴᴡᴏʀᴅ ᴄᴏʀʀᴇsᴘᴏɴᴅᴇɴᴄᴇ (Tsuchida 1995) preserves the segments of loanwords from input to output, and a specialized ɪᴅᴇɴᴛ-ɪᴏ constraint is added to force optimal candidates to source feature value information from the first syllable of the root.
When the constraints are applied in the displayed order, the surface forms are clear winners. Optimality Theory works quite well in handling both native and loanword phonology, as well as cases of exceptionality, in the Kazan Tatar language, and additional phenomena are to be explored using this framework in future work.
Figure 6.1: Word internal harmony cases.
Compound word allomorphy, demonstrated with 'birthday' and 'belt', now in the plural.
IDENT-Sσ1-A([back]): source affix feature value information for [back] from the first syllable in the stem.
SPREAD([back], PWord): for all tokens of [back] in a PWord, if a token is linked to any segment, it is linked to all segments.
IDENT-IO([back]): corresponding segments in the input and output have identical values for the feature [back].
then added additional cases to see if the phenomenon extended beyond those showcased in Comrie's writeup.
Loanword tableaux demonstrating 'soviet' and 'locksmith' in the singular.
This analysis holds for allomorphy in agglutinative processes, as well. The addition of the specialized ɪᴅᴇɴᴛ-sσ1-ᴀ([back]) constraint matters here such that the correct feature for [α back] is sourced.
Kazan Tatar's vowel system has harmony processes sensitive to features of backness, but not roundness nor other noticeable features. This coupled with Conklin 2015 is