Gradient categories in lexically-conditioned phonology: An example from sound symbolism*


Introduction
Sound symbolism highlights the interaction between categories and the different phonological patterns that arise between categories. Such interaction between phonologically-external categories and phonological patterns also exists in "core" morphophonology. In particular, lexically-conditioned phonology is exactly this situation, where different lexical categories often exhibit different phonotactic patterns or phonological alternations. For example, content words in English (such as nouns) are known to exhibit greater phonotactic contrasts than function words, whereas function words exhibit greater phonological reduction in running speech than content words (e.g., Selkirk 1984, 1996; Inkelas & Zec 1993; Kelly & Bock 1988; Kelly 1992; Segalowitz & Lane 2000; Bell et al. 2009; Smith 2011, 2016; Shih 2014, 2018). The categories that have been noted to be relevant for lexically-conditioned phonology in fact already include some that border on non-arbitrary patterns. While lexical conditioning often comes from morphosyntactically-defined categories (e.g., content versus function words; parts of speech) or arbitrary gender class systems, categories such as ideophones or sex (e.g., male/female) sometimes also engender differences in phonological behaviour. As such, sound symbolic data turns out to be a useful sandbox in which to explore phonological behaviours resulting from lexical conditioning. This paper offers a natural extension of our existing formal model for lexically-conditioned phonology, based on patterns in sound symbolic behaviours and illustrated by a dataset of male and female American English names. The paper is organized as follows. Section 2 sets up the theoretical environment, presenting an approach to lexically-conditioned phonology in Maximum Entropy Harmonic Grammar with a toy illustration. Section 3 presents a proposal to extend this approach to gradient category memberships. Finally, §4 concludes.
Each constraint thus is duplicated for every category, and each category-indexed constraint carries a weight separate from the "base" constraint weight (i.e., the non-indexed form of the constraint).
The following section illustrates lexically-conditioned phonology using this approach.

2.1 Toy illustration: Male versus female names
Phonotactic differences between male and female names in English have long been noted in the literature (e.g., Cassidy et al. 1999; Wright et al. 2005), and often pattern with sound-symbolic associations. Here, I use a dataset of the 200 most frequent male and 200 most frequent female names, from the Social Security Administration's list of American English names for U.S. births between 1990 and 1999. As a toy illustration, I focus on two of the strongest predictors of male and female names in the current dataset (from a list of male/female name differences from the previous literature, as tested with the MaxEntGrammarTool; Hayes et al. 2009).
In the current dataset, female names are significantly more likely than male names to avoid final stop obstruents, as shown in Figure 1. Male names in the dataset are significantly more likely than female names to begin with a stressed syllable (i.e., roughly, to be trochaic), as shown in Figure 2. For example, Elaine, a female name, is [n]-final and features iambic stress, while Albert, a male name, is [t]-final with trochaic, initial stress.
These preferences can be modeled using the constraints in (3) and (4):
(3) *T#: Penalize every name that ends with a final stop obstruent.
(4) TROCH: Penalize every name that does not begin with an initial stressed syllable.
Because male and female names behave differently, there need to be lexically-indexed versions of these constraints, weighted for each lexical class, as shown in (5). The "base" grammar would thus consist of the constraints that are not lexically indexed. The cophonology for male names consists of constraints that are lexically indexed for male, while the cophonology for female names consists of constraints that are lexically indexed for female. The tableau in (6) provides an illustration, with hand-weighted constraints, of the optimal selection for a phonological input that has the shape /CV.CVT/, given female or male name gender affiliation. A male name input is given in (6a) and a female name input is given in (6b).
As shown in (6), the TROCH♂ constraint, lexically indexed to male names, contributes additional weight and rules out any iambic candidates (CV.ˈCVN). In the same grammar, base TROCH cannot be weighted too high because iambic candidates do win, just not in the male name portion of the grammar. There are also two constraints indexed for female names: *T#♀ and WSP♀. The former provides extra penalization for name-final stop obstruents, as in the candidate ˈCV.CVT. The female name-specific WSP♀ ensures an algebraically higher harmony score for a candidate with iambic stress on the heavy final syllable than for one with trochaic stress. Thus far, the currently available approaches to lexically-conditioned phonology appear to work well, even for patterns originating from sound symbolism, as demonstrated by the male and female name toy illustration. Naturally occurring sound symbolic patterns, however, require more gradience in category structures, as demonstrated by the case studies in the following section.
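The logic of the tableau in (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's tableau: the weights are hypothetical hand-set values, the violation profiles are one plausible reading of the constraints, and IDENT is a hypothetical faithfulness constraint (not named in the text) that makes changing the final stop costly. Harmony is taken to be the negative weighted sum of violations, so a higher harmony is better.

```python
import math

# Violation counts for candidates of the input /CV.CVT/.
# IDENT is a hypothetical faithfulness constraint (an assumption of this sketch).
VIOLATIONS = {
    "ˈCV.CVT": {"TROCH": 0, "*T#": 1, "WSP": 1, "IDENT": 0},
    "CV.ˈCVT": {"TROCH": 1, "*T#": 1, "WSP": 0, "IDENT": 0},
    "ˈCV.CVN": {"TROCH": 0, "*T#": 0, "WSP": 1, "IDENT": 1},
    "CV.ˈCVN": {"TROCH": 1, "*T#": 0, "WSP": 0, "IDENT": 1},
}

# Hypothetical hand-set weights; these are NOT the weights used in (6).
BASE_WEIGHTS = {"TROCH": 1.0, "*T#": 1.0, "WSP": 1.0, "IDENT": 2.0}
INDEXED_WEIGHTS = {
    "male": {"TROCH": 4.0},              # TROCH indexed to male names
    "female": {"*T#": 4.0, "WSP": 2.0},  # *T# and WSP indexed to female names
}


def harmony(candidate, category):
    """Negative weighted violation sum: base constraints plus the
    constraints indexed to the input's (categorical) lexical class."""
    viol = VIOLATIONS[candidate]
    penalty = sum(BASE_WEIGHTS[c] * v for c, v in viol.items())
    penalty += sum(w * viol[c] for c, w in INDEXED_WEIGHTS[category].items())
    return -penalty


def maxent_probabilities(category):
    """MaxEnt distribution over candidates: P(x) is proportional to exp(H(x))."""
    scores = {cand: math.exp(harmony(cand, category)) for cand in VIOLATIONS}
    total = sum(scores.values())
    return {cand: s / total for cand, s in scores.items()}


def winner(category):
    """Most probable candidate for an input of the given class."""
    probs = maxent_probabilities(category)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    for category in ("male", "female"):
        print(category, winner(category), maxent_probabilities(category))
```

With these assumed weights, the male input favours the trochaic, stop-final candidate and the female input favours the iambic, nasal-final candidate, which matches the qualitative outcome described for (6a) and (6b).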

Theoretical ramifications: Gradient lexical category membership
As presented in §2, there are a number of existing approaches to lexically-conditioned phonology, including lexically-indexed constraints, strata, cophonologies, and sublexical grammars. One feature that nearly all of these approaches share is the assumption of crisp, discrete boundaries in lexical category membership. For instance, a word belongs either to the content word or function word class; a part of speech is either noun or verb; a morpheme is Latinate or not; a name is either male or female. Even analyses that use lexically-indexed constraints for the "expressive" lexicon assume this type of rigid category membership (e.g., Alderete & Kochetov 2016). Recent work in sound symbolism, however, has demonstrated that not all behaviours correspond to crisply delineated category membership. For example, work on sound symbolic correspondences between Pokémon names and their characteristics has demonstrated phonotactic patterns that scale with how evolved, how heavy, and how tall a character is (e.g., Kawahara (Shih & Rudin 2019). Many of the properties that correspond to sound symbolic phonotactic patterns are not categorical: for example, weight, height, and power, for both Pokémon and baseball players. Non-categoricity in category membership is in fact not a new issue in linguistics. See, for instance, the rich literature on scale structure in semantics (e.g., Kennedy & McNally 2005), which notes that some adjectives allow categorical membership (e.g., an entity can be either alive or dead, but not both or in between) whereas others do not have such clear-cut distinctions (e.g., whether an entity is tall can vary).
How, then, do our grammatical models handle gradient category membership in lexically-conditioned phonology if the existing mechanisms assume categorical (i.e., all-or-nothing) membership? One option is to posit a potentially infinite number of categorical cuts along a scale relevant for phonology. This approach, however, would be computationally rather inefficient, and it ignores the gradient or scalar nature of many category types. Alternatively, we can allow gradience in the category structures that the phonological grammar operates over. Scaling and gradience have been shown in recent literature to be necessary in many parts of phonological grammar and representation. Notably, gradient symbolic activations have been used to capture phonological elements that have blended representations, where a segment can be co-activated for conflicting feature representations (Smolensky et al. 2014; Smolensky & Goldrick 2016). The proposal here is that category membership can also have gradient symbolic activations; the harmony calculation from (2) has been updated in (7) accordingly.
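Equations (2) and (7) are not reproduced in this excerpt. Going only by the prose description, one way such an updated harmony calculation could be written is sketched below. The notation is an assumption made for illustration and is not the paper's own: w_k is the base weight of constraint k, w_{k,c} is the weight of the version of constraint k indexed to category c, C_k(x) is the number of violations that candidate x incurs on constraint k, and a_c(i) is the activation of category c for input i. The sign convention (harmony as a negative penalty) is likewise assumed.

```latex
H(x \mid i) \;=\; -\Big[\, \sum_{k} w_k \, C_k(x) \;+\; \sum_{c} a_c(i) \sum_{k} w_{k,c} \, C_k(x) \Big],
\qquad a_c(i) \in [0, 1]

P(x \mid i) \;=\; \frac{\exp H(x \mid i)}{\sum_{y} \exp H(y \mid i)}
```

Under this sketch, restricting every a_c(i) to 0 or 1 gives back ordinary categorical indexing, while intermediate values scale the contribution of the category-indexed constraints.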
Each input, then, is associated with a gradient activation for every category. Weighted category membership is not new to maximum entropy models outside of linguistics: multiple membership multilevel models use a similar structure to model mixed and multiple membership of individuals in groups (e.g., Browne et al. 2001), and are particularly useful in social network membership modeling (e.g., Tranmer et al. 2014). An example tableau is given in (8). Input 1 has a category activation of 0.9, whereas Input 2 has a category activation of 0.2. The same Candidate A, then, which violates the category-indexed constraint ℂ, will have different scores for the two inputs: Candidate A's violation will be multiplied by 0.9 for Input 1 and by 0.2 for Input 2. Candidate A will consequently be a more harmonic candidate for Input 2 than for Input 1.
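As a worked illustration of the scaling just described, suppose (hypothetically, since the weights of tableau (8) are not given in this excerpt) that the category-indexed constraint has weight 2 and that Candidate A violates it once. The symbols a and w are assumed names for the activation and the weight.

```latex
\text{Input 1: } a \cdot w \cdot 1 = 0.9 \times 2 \times 1 = 1.8
\qquad
\text{Input 2: } a \cdot w \cdot 1 = 0.2 \times 2 \times 1 = 0.4
```

The penalty contributed by the indexed constraint is therefore smaller for Input 2 (0.4) than for Input 1 (1.8), which is why Candidate A is the more harmonic candidate for Input 2, all else being equal.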

3.1 Revisiting the toy illustration: Male versus female names
As discussed in §2.1, male and female names exhibit phonotactic differences. However, gender categories are much more fluid, particularly in the 1990s, with the rise of unisex or gender-neutral names. In order to compare a "unisex" set to the most frequent male and female names, 200 of the most frequent names that were used for either gender no more than 69% of the time were taken from the same 1990-1999 Social Security Administration dataset. The behaviour of these unisex names is as predicted: they pattern between male and female name phonotactic preferences. As shown in Figure 3, unisex names avoid final stop obstruents less often than female names, but still more often than male names (compare, for example, Taylor to Elaine and Albert). Figure 4 shows that unisex names are more likely to have initial stress than female names, but less likely than male names.
These patterns can be modeled in the mixed membership MaxEnt HG by specifying a blended activation for unisex names. A demonstrative, hand-weighted tableau is given in (9).
In (9), a male name input has an activation of 1 in the male category (♂ = 1), while a female name input has an activation of 1 in the female category (♀ = 1). A unisex name, which is used for either male or female (as specified by the Social Security Administration data, which is, to date, gender-binary), has a blended activation for both categories. Thus, the lexically-indexed constraints for both categories apply and the optimal candidate is one that has blended features: in this illustration, a name that is trochaic and avoids name-final stop obstruents.
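The effect of blended activations can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reproduction of tableau (9): the weights are hypothetical, IDENT is a hypothetical faithfulness constraint not named in the text, and the unisex input is arbitrarily given an activation of 0.5 in each category.

```python
import math

# Violation counts for candidates of the input /CV.CVT/.
# IDENT is a hypothetical faithfulness constraint (an assumption of this sketch).
VIOLATIONS = {
    "ˈCV.CVT": {"TROCH": 0, "*T#": 1, "WSP": 1, "IDENT": 0},
    "CV.ˈCVT": {"TROCH": 1, "*T#": 1, "WSP": 0, "IDENT": 0},
    "ˈCV.CVN": {"TROCH": 0, "*T#": 0, "WSP": 1, "IDENT": 1},
    "CV.ˈCVN": {"TROCH": 1, "*T#": 0, "WSP": 0, "IDENT": 1},
}

# Hypothetical hand-set weights; these are NOT the weights used in (9).
BASE_WEIGHTS = {"TROCH": 1.0, "*T#": 1.0, "WSP": 1.0, "IDENT": 2.0}
INDEXED_WEIGHTS = {
    "male": {"TROCH": 4.0},
    "female": {"*T#": 4.0, "WSP": 2.0},
}

# Category activations per input type; the 0.5/0.5 blend is an assumed value.
INPUTS = {
    "male": {"male": 1.0, "female": 0.0},
    "female": {"male": 0.0, "female": 1.0},
    "unisex": {"male": 0.5, "female": 0.5},
}


def gradient_harmony(candidate, activations):
    """Negative penalty in which each category-indexed constraint is
    scaled by the input's activation for that category."""
    viol = VIOLATIONS[candidate]
    penalty = sum(BASE_WEIGHTS[c] * v for c, v in viol.items())
    for category, activation in activations.items():
        for constraint, weight in INDEXED_WEIGHTS[category].items():
            penalty += activation * weight * viol[constraint]
    return -penalty


def gradient_probabilities(activations):
    """MaxEnt distribution over candidates given the input's activations."""
    scores = {c: math.exp(gradient_harmony(c, activations)) for c in VIOLATIONS}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def gradient_winner(activations):
    """Candidate with the highest harmony for the given activations."""
    return max(VIOLATIONS, key=lambda c: gradient_harmony(c, activations))


if __name__ == "__main__":
    for label, activations in INPUTS.items():
        print(label, gradient_winner(activations))
```

With these assumed values, the fully male and fully female inputs select the same winners as under categorical indexing, while the half-activated unisex input selects the trochaic candidate without a final stop, the blended outcome described in the text.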

Conclusion
Do we need such gradient category membership outside of sound symbolic patterns? Categories in "core" phenomena of lexically-conditioned phonology also exhibit similar gradient behaviours. For example, auxiliary verbs such as can, could, might, must in English often show a duality in their phonological behaviours: they act like content words in hosting greater phonotactic contrasts, but they also act like function words in their propensity to reduce (compared to full verbs) in running speech. To deal with cases like this, previous research has often posited that the relevant categories are more than just "content" and "function" classes of words, resulting in anywhere from 4 to 10 categories along the content-to-function spectrum (e.g., Altenberg 1987; Hirschberg 1993; Shih 2014; Anttila 2017). However, under a gradient category membership approach, a binary category specification can be salvaged: auxiliaries, for instance, could carry blended activations for the content and function classes rather than requiring additional intermediate categories.
The abstraction of form as separate from meaning or concept has been foundational to modern linguistic study, as codified by the assumption of the arbitrariness of the sign. One consequence of this assumed division between form and meaning has been that "non-arbitrary" data such as sound symbolic phenomena have been largely overlooked in formal theory. Work that maintains that non-arbitrary patterns can and should be captured using formal phonological models is in the minority. At the very least, much of this work treats the question of whether non-arbitrary patterns belong in the grammar as open and unsettled (e.g., on modeling palatalization in formal phonological grammars).
This paper offers another argument that sound symbolic patterns, in spite of their non-arbitrary roots, are not as far removed from "core" phonological patterns as traditionally believed: sound symbolic phenomena can parallel the phonological patterns that our "core" phonological grammars already capture, particularly in how we deal with lexically-conditioned phonology.