Variation, the Height Effect, and Disharmony in Hungarian Front/Back Harmony


The Height Effect
Many vowel harmony systems systematically tolerate a certain degree of disharmony within the harmonic domain: some of the vowels frequently occur in sequences of syllables where they disagree in the harmonic property with the vowel(s) of (an) adjacent syllable(s). These vowels are the harmonically neutral vowels (N) in these systems.¹ The vowels that never or only rarely (unsystematically) occur in disharmony are the harmonic vowels (front (F) and back (B) in a front/back harmony system). In Hungarian front/back harmony there are four neutral vowels (the unrounded front vowels i, iː, eː, ɛ), which show a gradience in neutrality (from most neutral to least neutral): i(ː) > eː > ɛ. This is known as the Height Effect (Hayes & Cziráky Londe 2006). Hungarian has the following sources or contexts of frequent disharmony with neutral vowels:
i. mixed root phonotactics: N vowels freely combine with B vowels in roots ([…BN…], […NB…]), e.g. hɑmiʃ 'false', vilaːg 'world', taːɲeːr 'plate', peːldɑ 'example', fotɛl 'armchair', sɛrdɑ 'Wednesday'²
ii. antiharmony: some monosyllabic N roots require the B alternant of a harmonically alternating suffix ([N]B), e.g. diːj-ɑt (*diːj-ɛt) 'prize-ACC', heːj-ɑt (*heːj-ɛt) 'peel-ACC'
iii. transparency: after a B vowel, an N vowel is followed by a B vowel ([…BN]B), e.g. koʧi-nɑk (*koʧi-nɛk) 'car-DAT', kɑʃteːj-nɑk (*kɑʃteːj-nɛk) 'castle-DAT'
iv. invariance: the harmonically invariant suffixes almost exclusively have N vowels ([…B]N), e.g. haːz-ig 'house-TERM', haːz-eːrt 'house-CAUS'
The Height Effect has been studied extensively in the literature (e.g. Hayes & Cziráky Londe 2006, Hayes et al. 2009, Rebrus & Törkenczy 2015, 2016a, 2019). Most of these analyses focus on only one of its aspects, transparency: the less neutral a vowel is, the less transparent it is. These studies ignore the other aspects of disharmony, although the Height Effect manifests itself in most of these contexts and does so cumulatively. Consider Table 1, where "+" means a frequent source of disharmony with the relevant vowel. As can be seen, (i) there is no difference between the neutral vowels³ in root phonotactics: they all combine freely with back vowels in roots. There are differences in (ii) antiharmony: antiharmonic roots contain nonlow neutral vowels (there are far fewer with mid eː than with high i and iː) and hardly ever contain low ɛ; (iii) transparency: the high neutral vowels are fully transparent but the nonhigh ones may be opaque; and (iv) invariance: high neutral vowels i and iː occur in harmonically invariant suffixes, mid eː occurs in both invariant and alternating ones, and low ɛ only occurs in harmonically alternating suffixes. Considering pluses as prototypical neutral behaviour, we get an impressionistic measure (the "N-score") of gradience, shown in the last column. N~B stands for the N vowel of the front alternant of a harmonizing suffix (e.g. -neːl 'ADE', cf. -naːl, and -nɛk 'DAT', cf. -nɑk).
Compare stems whose vocalic makeup is harmonically identical: […Beː] taːɲeːr-nɑk (*taːɲeːr-nɛk) 'plate-DAT', but sloveːn-nɑk/nɛk 'Slovene-DAT'; […Bɛ] hotɛl-nɑk/nɛk 'hotel-DAT', but konʦɛrt-nɛk (*konʦɛrt-nɑk) 'concert-DAT'. In these cases we find the same hierarchy of neutrality: the less neutral a neutral vowel is, the more lexical variation occurs, i.e. the more groups of stems can be found that behave (somewhat) differently in harmony although their vocalic makeup is harmonically identical. There are no lexical subgroups based on differences in harmonic behaviour within […BN] stems when N is i(ː), there are several subgroups when N is ɛ, and eː is intermediate in this respect too.
* This work was supported by the National Scientific Grant NKFI-119863 'Experimental and theoretical investigations of vowel harmony patterns'.
1 In spite of what is usually assumed (e.g. van der Hulst & van de Weijer 1995, van der Hulst 2016), neutrality may or may not derive from unpairedness in the vowel inventory (e.g. Kiparsky & Pajusalu 2003, Törkenczy 2013, Ozburn 2019). Here we define neutrality empirically (based on the frequency of disharmonic sequences) and not as a derivative of (some property of a) phonological/underlying representation.
2 We will use […] to indicate stem domains and disregard consonants in formulae.
3 As is usual, we do not distinguish between long iː and short i as they represent the same degree of neutrality.
The connections between these aspects of the Height Effect (the four contexts of disharmony and lexical variation) have never been explored. We claim that these manifestations are not independent but stand in a complex relation based on the frequency of the relevant contexts. In this paper we examine the relationship between two aspects of the Height Effect: disharmony associated with invariance in suffixes ((iv) in Table 1) and the gradience in transparency in roots ((iii) in Table 1, the aspect of the Height Effect that is usually analysed in the literature). We will show that there is a parallelism between the differences in the distribution of the various neutral vowels in invariant suffixes and the Height Effect as manifested in the gradience of transparency in roots. We will argue that the latter is in fact motivated by the former due to a general constraint, which we will call here Harmonic Consistency,⁴ and which regulates the harmonic behaviour of morphologically complex contexts, i.e. suffixed forms (Rebrus & Szigetvári 2016, Rebrus & Törkenczy 2016b, 2019).
In Section 2 we describe the data and the parallelism between the Height Effect in transparency and invariance in more detail. In Section 3 we discuss Harmonic Consistency and its effect on reliability in predicting harmony. In Section 4 we propose an analysis of the parallelism between transparency and invariance and identify problems for future research.

Transparency, invariance and the Height Effect
As we pointed out above there is a parallelism between the application of the Height Effect in transparency and suffix invariance in that it results in the arrangement of the neutral vowels in the same hierarchy in both these aspects of disharmony (i(ː) > eː > ɛ). However, the Height Effect manifests itself differently in transparency and invariance: in the former it shows up in vacillation and in the latter in the type frequency of invariant suffixes.

Transparency
Gradience in transparency is the most studied and most firmly established aspect of the Height Effect (most authors use the term to refer to gradience in transparency only, e.g. Hayes & Cziráky Londe 2006). Here the Height Effect manifests itself in vacillation between harmonic suffix alternants, i.e. the state of affairs in which […BN] stems occur in doublets of word-forms where one has the back and the other the front alternant of a harmonically alternating suffix (e.g. hotɛl-nɑk/nɛk 'hotel-DAT'). The more open a neutral vowel, the more likely it is that vacillation occurs and the higher the probability of the form with the F alternant. There is virtually no vacillation after […Bi(ː)] stems⁵ (e.g. koʧi-nɑk/*nɛk), some […Beː] stems vacillate while others do not (e.g. boheːm-nɑk/nɛk 'free spirit-DAT' vs. taːɲeːr-nɑk/*nɛk 'plate-DAT'), and […Bɛ] stems typically vacillate (e.g. hotɛl-nɑk/nɛk). Corpus studies have shown that the Height Effect can be detected in lexical (type) frequencies: Hayes & Cziráky Londe (2006) ran a Google-based search for word-forms with the harmonically alternating dative suffix (-nɑk~nɛk), and Rebrus & Törkenczy (2016a) counted word-forms containing any alternating suffix in the Szószablya webcorpus (which contains 541 million word tokens; Halácsy et al. 2004). Psycholinguistic studies also confirm that native speakers respond stochastically when tested on the Height Effect and that their responses in the aggregate match the lexical frequencies, both in a wug test (Hayes & Cziráky Londe 2006) and in an elicited production task disguised as a sentence repetition task (Benkő et al. 2018, Patay et al. in press).

Invariance
The differences in the distribution of neutral vowels between harmonically alternating and invariant suffixes have been noted in the literature (cf. Törkenczy 2016), especially the fact that the low vowel ɛ does not occur in invariant suffixes (which some authors who assume that neutrality is categorical take as evidence for the non-neutrality of ɛ, e.g. Ringen 1978). The parallelism, however, typically is not part of the analyses proposed.
We have summarized gradience in invariance in categorical terms (showing whether a given neutral vowel occurs or does not occur in invariant vs. alternating suffixes) in Table 1 above. The Height Effect is even more clearly visible when we examine the distribution of neutral vowels in harmonically alternating vs. invariant suffixes in detail: i(ː) occurs in 12 invariant suffixes but in only 1 (suppletive) harmonically alternating one (hence the star in Table 2), "less neutral" eː occurs in 9 invariant suffixes and 15 alternating ones, and "least neutral" ɛ does not occur in invariant suffixes at all,⁶ but occurs in 62 harmonically alternating ones. We get an even sharper picture if we only consider "non-terminal" suffixes, i.e. those that can be followed by another suffix. This distribution means that there is a high number of word-forms in which a neutral vowel occurs in disharmony: the more frequently a vowel occurs in invariant suffixes, especially in non-terminal invariant ones, the more frequently we can find it in disharmony. Thus, in accordance with the Height Effect, vowels higher in the neutrality scale occur more frequently in disharmony in polymorphemic contexts than the less neutral ones.
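The suffix counts just quoted can be turned into a single figure per vowel. The following is a minimal sketch, not part of the original analysis: it takes the type counts stated above (12/1, 9/15, 0/62) and computes the share of invariant suffixes among all suffixes containing each neutral vowel. The names `SUFFIX_COUNTS` and `invariant_share` are illustrative assumptions.

```python
# Suffix type counts as reported in the text: (invariant, harmonically alternating).
SUFFIX_COUNTS = {
    "i(ː)": (12, 1),
    "eː": (9, 15),
    "ɛ": (0, 62),
}


def invariant_share(invariant: int, alternating: int) -> float:
    """Proportion of suffix types with this vowel that are harmonically invariant."""
    total = invariant + alternating
    if total == 0:
        raise ValueError("no suffixes recorded for this vowel")
    return invariant / total


# Share of invariant suffixes per neutral vowel.
SHARES = {vowel: invariant_share(*counts) for vowel, counts in SUFFIX_COUNTS.items()}

# Vowels ordered from the highest to the lowest share of invariant suffixes.
RANKING = sorted(SHARES, key=SHARES.get, reverse=True)

if __name__ == "__main__":
    for vowel in RANKING:
        print(f"{vowel}: {SHARES[vowel]:.3f}")
```

Under these counts the ordering of the shares reproduces the neutrality scale i(ː) > eː > ɛ. The share is only one possible summary; it ignores the terminal vs. non-terminal distinction mentioned above.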

Reliable and less reliable contexts for harmony, Harmonic Consistency
Assuming that the target of harmony is a harmonically alternating suffix, there are more and less reliable contexts for predicting which of its suffix alternants a stem will take.
A vowel of the harmonic class B or F immediately preceding the suffix makes a fully reliable context for harmony since the inferences for the harmonic value of the suffix are unambiguous. This is shown in (1), where N~B is the N vowel of an alternating suffix (its front alternant).
(2) Reliable contexts for harmony: polymorphemic stem + harmonizing suffix
kor-ok-roːl/*røːl 'age-PL-DEL'    kor-ok-rɑ/*rɛ 'age-PL-SUBL'    kor-ok-naːl/*neːl 'age-PL-ADE'
kør-øk-røːl/*roːl 'circle-PL-DEL'    kør-øk-rɛ/*rɑ 'circle-PL-SUBL'    kør-øk-neːl/*naːl 'circle-PL-ADE'
øːz-ɛk-røːl/*roːl 'deer-PL-DEL'    øːz-ɛk-rɛ/*rɑ 'deer-PL-SUBL'    øːz-ɛk-neːl/*naːl 'deer-PL-ADE'
By contrast, a preceding N vowel is an unreliable context for harmony because either a back or a front alternant can follow. In (3) below we show this for monomorphemic contexts (i.e. when N is not in a suffix), separately for i(ː), eː and ɛ. We indicate the back and front alternants (harmonic values) a harmonically alternating suffix takes in the given context as B and F. Note that F as a harmonic value of the harmonizing suffix (but not as part of an environment) includes an N vowel that is the front alternant of a harmonically alternating suffix, i.e. N~B (e.g. eː in the front alternant of adessive -neːl, which alternates with -naːl). In tables "any vowel" (abbreviated as "any") means that either value can occur in the given context due to lexical variation or vacillation, i.e. the given context underdetermines suffix harmony.
Examples: koʧi-nɑk 'car-DAT', røvid-nɛk 'short-DAT'; taːɲeːr-nɑk 'plate-DAT', køveːr-nɛk 'fat-DAT'; hotɛl-nɑk/nɛk 'hotel-DAT', ørɛg-nɛk 'old-DAT'
(3) shows that no⁹ reliable predictions can be made about suffix harmony based on a preceding N vowel. When further details are known about the string before the N (the vowel(s) of the preceding syllable(s), or whether there is a preceding syllable at all), inferences become more reliable. Consider (4).
We can see in (4) that 5 out of the 9 possible inferences about the harmonic value of a suffix are reliable when the vowel preceding the stem-final N is also known. When there is no preceding vowel ([N]_), i.e. the stem is monosyllabic, 1 out of the possible 3 inferences is reliable (there are no monosyllabic antiharmonic roots with ɛ). Altogether, half of the 12 possible inferences are reliable. Although this context is still not as reliable as those in (1) and (2), (4) is a significant improvement over (3), where no reliable inferences can be made. Interestingly, however, as opposed to the (reliable) contexts in (1) and (2), there is a difference in reliability between the monomorphemic contexts in (3) and (4) on the one hand and the corresponding polymorphemic contexts where N is in an invariant suffix on the other. Specifically, as we shall see below, the polymorphemic contexts are more reliable than the monomorphemic ones. In (5) we can see that there are no invariant suffixes with the neutral vowel ɛ. However, ɛ in a suffix alternant is a reliable context for harmony, because in this case it is the front alternant of the suffix (N~B) and therefore only the front value (F) is possible for a following harmonically alternating suffix; see the example føld-ɛk-tøːl and (2) above.
9 For the sake of simplicity we disregard here gradience in ambiguity although the different unreliable contexts are statistically not equally unreliable (e.g. due to the Height Effect). Taking them into consideration would yield a more realistic picture, but we leave this for future research.
10 Distinguishing between i(ː), eː, ɛ as V1 would not make a difference in this context, but what really counts is the vowel before the two Ns: for B the suffix is vacillating ([BNN
Examples: haːz-i-toːl, paːl-eː-toːl; føld-i-tøːl; yːr-eː-tøːl; føld-ɛk-tøːl (Glosses are missing for the words already given in (5).)
In the context in (6) there are only 8 possible inferences (if we include those where ɛ occurs in the front alternant of a harmonic suffix).
The missing four contexts (compared to context (4)) are not possible because (a) the final N in the context must be preceded by some syllable, since a word cannot consist of just a suffix and all stems contain at least one vowel; and (b) [[…B]ɛ] is not licit, since ɛ only occurs as the front alternant of a harmonic suffix and as such cannot follow a stem whose final syllable has B (see (1) and (2) above). Six of the 8 possible inferences are reliable, which is better than in the corresponding monomorphemic context (4). We have seen that the harmonic context N_ is unreliable and that it becomes more reliable the more is known about the vowels preceding it. We have only examined how the identity of the vowel of the syllable immediately preceding N reduces harmonic ambiguity. Knowing another preceding vowel would reduce ambiguity further,¹³ but for simplicity's sake we do not consider this and other "richer" contexts here.
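The comparison of contexts can be restated as simple proportions. The sketch below uses only the counts given in the running text; it assumes that context (3) comprises three inferences (one each for i(ː), eː and ɛ), none of them reliable. The dictionary keys and the helper name `reliability` are illustrative, not the authors' terminology.

```python
from fractions import Fraction

# (reliable inferences, possible inferences), as stated in the running text.
# Assumption: (3) has one inference per neutral vowel class, i.e. 3 in total.
CONTEXTS = {
    "(3) monomorphemic, N_ only": (0, 3),
    "(4) monomorphemic, preceding vowel known": (5 + 1, 9 + 3),
    "(6) polymorphemic, N in a suffix": (6, 8),
}


def reliability(reliable: int, possible: int) -> Fraction:
    """Share of possible inferences that are reliable in a context."""
    if possible <= 0:
        raise ValueError("a context needs at least one possible inference")
    return Fraction(reliable, possible)


RELIABILITY = {name: reliability(*counts) for name, counts in CONTEXTS.items()}

if __name__ == "__main__":
    for name, share in RELIABILITY.items():
        print(f"{name}: {share} ({float(share):.2f})")
```

On these figures the polymorphemic context (three quarters reliable) outranks the corresponding monomorphemic one (one half), which in turn outranks the bare N_ context. Like the text, the sketch treats every unreliable inference as equally unreliable.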
The main point we want to make here is that polymorphemic contexts are more reliable than the corresponding monomorphemic ones. This is due to Harmonic Consistency, a paradigm uniformity constraint that applies to harmony in multiply suffixed forms. Harmonic Consistency (cf. Rebrus & Szigetvári 2016, 2019) means that in Hungarian stems are consistent in their harmonic behaviour in that all alternating suffixes (derivational or inflectional) behave in the same way harmonically (they take F, B or vacillating F/B allomorphs) when attached to the same stem. Accordingly, the harmony of the root is preserved in suffixed forms: the harmonically suffixed forms of a stem are all back, all front or all vacillating.¹⁴ This can be expressed as (7):
(7) Harmonic Consistency (HC)
All the harmonically alternating suffixes have identical harmonic values (F, B or F/B) within the (extended) paradigm of a stem.
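As an illustration of what (7) requires, here is a minimal sketch that checks a stem's paradigm for Harmonic Consistency. It is an assumption-laden toy, not the authors' formalization: a paradigm is represented as a mapping from an alternating suffix to the set of alternants attested with the stem ("B", "F", or both for vacillation), and the function names are hypothetical. The example paradigm for paːl uses the back-harmonic forms paːl-nɑk, paːl-toːl, paːl-uŋk cited in this paper.

```python
from typing import Dict, Iterable


def harmonic_value(alternants: Iterable[str]) -> str:
    """Map the attested alternant classes of one suffix to 'B', 'F' or 'F/B'."""
    attested = frozenset(alternants)
    if attested == {"B"}:
        return "B"
    if attested == {"F"}:
        return "F"
    if attested == {"B", "F"}:
        return "F/B"  # vacillation: both alternants occur with the stem
    raise ValueError(f"unexpected alternant set: {sorted(attested)}")


def is_harmonically_consistent(paradigm: Dict[str, Iterable[str]]) -> bool:
    """True if every alternating suffix has the same harmonic value for this stem."""
    values = {harmonic_value(alts) for alts in paradigm.values()}
    return len(values) <= 1  # an empty paradigm is vacuously consistent


# Back-harmonic root paːl: only back alternants are attested.
PAL_PARADIGM = {
    "-nɑk/-nɛk": {"B"},
    "-toːl/-tøːl": {"B"},
    "-uŋk/-yŋk": {"B"},
}

if __name__ == "__main__":
    print(is_harmonically_consistent(PAL_PARADIGM))
```

A paradigm in which one suffix takes only B and another only F would be flagged as inconsistent. The sketch treats vacillation as a third categorical value, as (7) does, and ignores its statistical gradience.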
HC restates phonological harmony (applies vacuously) in multiply suffixed forms with harmonically alternating suffixes only: the final suffixes of multiply suffixed bor-ok-toːl 'wine+PL+ABL', ʃør-øk-tøːl 'beer+PL+ABL' behave exactly like those in singly suffixed torok-toːl 'throat+ABL', tørøk-tøːl 'Turk+ABL'. However, HC may override phonological harmony in multiply suffixed forms which also contain invariant suffixes. Specifically, it overrides the Height Effect for transparency,¹⁵ which we illustrate in Table 3.
11 There is no such context, see the discussion below (6).
Table 3. Harmonic Consistency
I: paːl
II: paːl-nɑk, paːl-toːl, paːl-uŋk, … (*paːl-nɛk, *paːl-tøːl, *paːl-yŋk, …)
III: paːl-eːk-nɑk, *paːl-eːk-nɛk; paːl-eː-nɑk, *paːl-eː-nɛk
IV: sloveːn-nɑk, sloveːn-nɛk
I: vaːr
II: vaːr-hɑt, vaːr-uŋk, …
In Table 3 we can see that the Height Effect, which applies to a [Beː] root (column IV), is overridden by HC, which inhibits vacillation after an invariant suffix (familiar plural -eːk, anaphoric possessive -eː, or conditional -neːk in column III) that is added to a back-harmonic root (as shown by the paradigms in column II). This happens in spite of the fact that the vocalic makeup of the stem preceding the final suffix in the multiply suffixed form (column III) and that in the singly suffixed form (column IV) are harmonically identical.

Motivating the parallelism between transparency and invariance
The transparency aspect of the Height Effect has been the subject of OT analyses. In order to capture the degrees of neutrality, these models employ harmony constraints relativized to the different neutral vowels, which may be ranked above/below some other constraints with different probabilities (Stochastic OT, Hayes & Cziráky Londe 2006) or can have different weights (MaxEnt grammar, Hayes et al. 2009), or they use different trigger/target scaling factors specific to the various neutral vowels (Harmonic Grammar, Bowman 2013, Ozburn 2019). While these analyses can handle the Height Effect as manifested in degrees of transparency (vacillation), they disregard its application to invariance and the parallelism between transparency and invariance. The main problem for modelling the parallelism in OT in these ways is that although the direction of the gradience is the same in both transparency and invariance (i(ː) > eː > ɛ), the way neutrality manifests itself is different. In transparency, neutrality is the probability of vacillation and the likelihood of a front suffix alternant, but in invariance it is the type frequency of invariant suffixes with the different neutral vowels. For this reason OT constraints seem unsuited to capturing the parallelism.
We outline a different approach below, in which we suggest that the parallelism is due to the frequency with which the different N vowels occur in disharmony: N vowels higher up in the neutrality scale are more likely to occur in disharmony than those lower down.
We have seen in §2.2 that (a) the Height Effect applies to suffix invariance in that an N vowel higher in the neutrality scale occurs more frequently in disharmony in polymorphemic contexts than a less neutral one, since the former occurs more frequently in (non-terminal) invariant suffixes. Furthermore, we have seen in §3 that (b) polymorphemic contexts with N-vowel suffixes are more reliable (less ambiguous) contexts for harmony than the corresponding monomorphemic contexts. This means that
(8) An N vowel higher in the neutrality scale occurs more frequently in reliable harmonic contexts than a less neutral N vowel.
In an analogical approach to the parallelism problem (e.g. Blevins & Blevins 2009), the basic assumption is that (similar) patterns influence one another and that the degree of influence is proportionate to the strength of a pattern, which is determined by its frequency. In such an approach it is also reasonable to assume that a reliable pattern can serve as a source of analogy for ambiguous harmony contexts. Thus, there is a difference in strength between the reliable patterns involving different neutral vowels as source patterns. Furthermore, we can also assume that a strong reliable pattern will have a more salient analogical effect on a similar but less reliable target pattern, the weaker (less frequent) the target pattern is. Such a difference is clearly detectable if we compare the (token and type) frequencies of N vowels in the ambiguous monomorphemic target contexts […BN]_ and in the reliable polymorphemic source contexts […B]N]_. The preliminary statistics¹⁶ in Table 4 confirm this relationship: (i) the reliable source contexts are clearly more frequent than the ambiguous target ones (in token, but even more markedly in type frequencies) and (ii) reliable and unreliable contexts with higher neutral vowels are more frequent than with lower ones (and there is no such context with ɛ). Our analysis is based on these relationships: we assume that the stronger a reliable source pattern is (the more […B]N]_ word forms (types) occur compared to […BN]_ stems), the more a weaker, similar but ambiguous context ([…BN]_) will pattern after the source pattern ([…B]N]_). Conversely, the weaker the reliable source pattern is, the less it can reduce ambiguity of behaviour in an unreliable context by providing a reliable pattern. We claim that ambiguity manifests itself in variation: in the case of harmony, vacillation and lexical variation (i.e. subgroups of stems with identical vocalic makeup behaving harmonically differently).
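One hypothetical way to make the notion of source-pattern strength concrete is a simple ratio of type counts. The sketch below is an assumption, not the measure used in the paper: `source_strength` divides the number of reliable polymorphemic […B]N]_ word-form types by the number of ambiguous monomorphemic […BN]_ stem types, and `rank_by_source_strength` orders the neutral vowels accordingly. No counts are supplied here; they would have to come from the corpus.

```python
from typing import Dict, List, Tuple


def source_strength(source_types: int, target_types: int) -> float:
    """Ratio of reliable source types ([…B]N]_) to ambiguous target types ([…BN]_).

    Hypothetical measure: a larger value stands for a stronger source pattern.
    """
    if source_types < 0:
        raise ValueError("source type count cannot be negative")
    if target_types <= 0:
        raise ValueError("at least one target type is needed for a comparison")
    return source_types / target_types


def rank_by_source_strength(counts: Dict[str, Tuple[int, int]]) -> List[str]:
    """Order vowels from the strongest to the weakest source pattern.

    `counts` maps a vowel label to (source_types, target_types).
    Under the analogical hypothesis, expected variation grows down the list.
    """
    return sorted(counts, key=lambda vowel: source_strength(*counts[vowel]), reverse=True)
```

A vowel with no invariant suffixes gets a strength of zero and therefore ends up last in the ranking, which on the analogical account is where variation is expected to be greatest. A plain ratio is the simplest choice; other monotone functions of the two frequencies would serve the same illustrative purpose.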
This is exactly what we find with the Height Effect as manifested in transparency and invariance:
1. There is no variation in transparency (no vacillation and no lexical items that behave (somewhat) differently in harmony) with the high N vowels i(ː) in the monomorphemic context […Bi(ː)]_ because there is a robust reliable polymorphemic pattern […B]i(ː)]B] that items with the monomorphemic context can follow.
2. The reliable polymorphemic pattern […B]eː]B] is weaker for mid eː and thus we find both vacillation and lexical variation in the monomorphemic context […Beː]_. About half of the […Beː] stems are vacillators (e.g. sloveːn-nɑk/nɛk 'Slovenian-DAT', norveːg-nɑk/nɛk 'Norwegian-DAT', boheːm-nɑk/nɛk 'easygoing-DAT') and half of them only take the back alternants of harmonically alternating suffixes (e.g. somseːd-nɑk/*nɛk 'neighbour-DAT', taːɲeːr-nɑk/*nɛk 'plate-DAT', kɑʃteːj-nɑk/*nɛk 'castle-DAT'). The lexical variation between the two subgroups corresponds to a distinction between "new" items (recent loans, unfamiliar/nonsense words), which tend to vacillate, and "old" items (high frequency words, non-recent loans, words of Finno-Ugric origin), in which eː tends to be fully transparent (cf. Rebrus & Törkenczy 2019).¹⁷
3. In the absence of a reliable polymorphemic pattern (due to the lack of invariant suffixes with ɛ), variation is rampant when the neutral vowel is low ([…Bɛ]_). Typically, […Bɛ] stems are
16 Data are from the Szószablya webcorpus (Halácsy et al. 2004). Due to the nature of the corpus, only inflectional polymorphemic […B]N]_] word-forms could be counted and the count includes the terminal suffixes too; a more precise measurement would have to include the derivational suffixes as well.
17 Token frequency can "reclassify" stems: e.g. "old" but extremely rare gaːʧeːr 'drake' vacillates (gaːʧeːr-nɛk/nɑk 'drake-DAT') and "new" but very frequent koŋkreːt 'specific' predominantly takes the back alternants of harmonic suffixes (koŋkreːt-ɑk/*?ɛk 'specific-PL').