The cat stalked ?wilily around the house: Morphological dissimilation in deadjectival adverbs

The adverbial suffix -ly1 and the adjectival suffix -ly2 typically do not combine (e.g., *ghost+ly2+ly1; ‘in a ghostlike manner’). However, phonologically similar strings are attested when one /li/ string is part of the word stem (jollily, compared to ?smellily, *lovelily). Does morphological structure modulate the acceptability of these words independently of phonological or usage-based constraints? In two experiments, jolly-type stems were rated more acceptable than smell- and love-type stems, which did not significantly differ from each other. A combination of phonological constraints and increased morphological complexity can account for the observed pattern.


Introduction.
There are lexical gaps in English that appear to arise from some property of the otherwise productive morphemes that would combine to form the missing word. For instance, the phrase "in a timely manner" should in principle be expressible as an adverb by adding the adverbial suffix -ly, but the resulting word (timely+ly → "timelily") is largely unattested (except, perhaps, as a speech error or as metalinguistic humor). Why might this be? This paper sets out to describe what kinds of constraints or biases might be at play in the productivity and acceptability of such words. These deadjectival adverbs comprise a very small set of lexical items in English and are relatively infrequent in usage. For the purposes of this study, adverbs of any derivation whose surface form contains "-lily" /lɪli/ will be called -lily words. We describe three subtypes of -lily words, illustrated in more detail in Table 2: STEM (e.g., jolly+ly, in which the first 'ly' is entirely within the stem), STEMY (e.g., smelly+ly, in which the first 'ly' is split between the stem and a deadjectival suffix -y), and STEMLY (e.g., lovely+ly, in which the first 'ly' is entirely a deadjectival morpheme).
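The three-way distinction can be made concrete with a minimal Python sketch. It assumes each word arrives already segmented by hand with "+" marking morpheme boundaries (as in jolly+ly, smell+y+ly, love+ly+ly); the function names `classify_lily` and `boundary_count` are hypothetical and introduced only for illustration, and no automatic morphological parsing is attempted.

```python
def classify_lily(segmented: str) -> str:
    """Classify a hand-segmented -lily adverb as STEM, STEMY, or STEMLY.

    Assumption: `segmented` uses "+" for morpheme boundaries and ends in
    the adverbial suffix "ly", e.g. "jolly+ly" or "smell+y+ly".
    """
    parts = segmented.lower().split("+")
    if len(parts) < 2 or parts[-1] != "ly":
        raise ValueError("expected a form ending in the adverbial suffix +ly")
    inner = parts[1:-1]  # suffixes between the stem and adverbial -ly
    if not inner:
        return "STEM"    # first 'ly' lies entirely within the stem
    if inner == ["y"]:
        return "STEMY"   # first 'ly' is split between the stem and -y
    if inner == ["ly"]:
        return "STEMLY"  # first 'ly' is itself a suffix
    raise ValueError(f"unexpected suffix sequence: {inner}")


def boundary_count(segmented: str) -> int:
    """Number of morpheme boundaries in the segmented form."""
    return segmented.count("+")


for form in ("jolly+ly", "smell+y+ly", "love+ly+ly"):
    print(form, classify_lily(form), boundary_count(form))
```

The classification depends only on which suffixes intervene between the stem and the adverbial -ly, so the sketch inherits whatever segmentation decisions the analyst makes beforehand.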
Based on the apparent relative unacceptability and infrequency of words like "timelily", we set out to describe what types of constraints or biases might be at play regarding the acceptability and/or productivity of -lily words. We begin by establishing the usage patterns of -lily words through a brief corpus analysis. We then describe four hypotheses and evaluate them using the results of two sentence acceptability experiments. We conclude by proposing how future approaches to our research question could build on the present results.

Corpus analysis.
In order to confirm the distribution and frequency of -lily words and to build representative stimuli, we examined the Corpus of Contemporary American English (COCA; Davies 2008-). At the time of access, COCA contained over 570 million words from a variety of spoken and written sources. Words with the -lily pattern were identified and looked up in three forms: the -lily word (adverbial) form, the adjectival form, and the root form when the adjectival form differed from the root (i.e., for STEMY and STEMLY words). For each adjectival form of a -lily word, a near-synonym adjective with a similar frequency was found (e.g., surly / grumpy; smelly / stinky; portly / beefy). From these synonym adjectives, adverbs were created by adding the adverbial -ly suffix in the same manner as for the -lily words. The mean and median raw frequencies for all of these forms are listed in Table 1, and all stimuli can be found in the Appendix. Frequencies varied substantially between items because of the difficulty of finding words that matched both the morphological and phonological patterns. This difficulty itself indicates that the set of relevant words is indeed as small as claimed. Furthermore, the generation of unattested forms (i.e., frequencies of 0 in COCA) was unavoidable in stimulus construction because of the small number of available words with sufficiently high root frequencies and the marginally grammatical nature of -lily words. Even so, the most attested -lily words belong to the STEM category, which confirms the intuition that these words are more widespread than words from the other two categories.

Hypotheses.
One possibility, which we take as a baseline and will refer to as the phonological hypothesis, is that there is some phonotactic, perceptual, or articulatory constraint against having two adjacent strings of 'ly' or /li/ (Walter 2007 on articulatory gesture). This constraint is violable, but it biases speakers against using words that violate it, which limits the frequency and spread of such words. One immediately apparent challenge to this hypothesis is the presence of words like "holily" (holy+ly) and "jollily" (jolly+ly), which are not only attested, but are sufficiently attested to be codified in dictionaries, which typically document contemporary and historical language use (e.g., Curzan 2014:103). This suggests that the phonological properties of the adjacent sequences are not, on their own, what prevents -lily words from entering common usage.
For the purposes of this study, the phonological hypothesis is not mutually exclusive with our alternate hypotheses. In fact, the combination of the phonological hypothesis with an additional morphologically motivated hypothesis is a crucial component of our analysis. To that end, each of the following hypotheses takes into account the contribution of the phonological form of -lily words to the infelicity of the word overall. We will illustrate the predictions of each hypothesis with three sample lexical items chosen to represent the morphological structure of the words to be tested. These three words are illustrated by the examples in
The alternate hypotheses will primarily consider the possibility that the morphological structure of -lily words plays a role in determining how acceptable (and how widespread) they are. This is not entirely straightforward because of the ways in which the phonological hypothesis might interact with morphological structure, so each of the following hypotheses sketches out a particular underlying mechanism that might cause variation across forms.
One such possibility is described by the morpheme boundary hypothesis. This hypothesis stipulates that the presence of a morpheme boundary decreases the acceptability of the word independent of other factors by increasing the time to access (e.g., Stockall & Marantz 2006; Caselli, Caselli & Cohen-Goldberg 2016). Thus, words containing the fewest decomposable morphemes should be the most acceptable of the comparison set. If this is the case, we might account for the attestation of STEM words over STEMY and STEMLY words by the fact that STEM words have one morpheme boundary and the others have two, as illustrated in Table 2. Given the additional constraints imposed by the phonological hypothesis, none of the -lily words would be expected to be as acceptable or widespread as (near-)synonyms of similar morphological structure.
(1) Morpheme boundary hypothesis: Any presence of a morpheme boundary decreases acceptability, in addition to the difficulty of having two identical phonological sequences.
(STEM) >> (STEMY, STEMLY)
(JOLLY + LY) >> (SMELL + Y + LY, LOVE + LY + LY)
Alternatively, the phonological form of the morphemes may play a larger role than just the morphological complexity of the word. That is, whichever phonological constraint biases speakers against -lily words applies most strongly to words in which the two adjacent morphemes have identical forms. This identity hypothesis limits the contribution of the phonological hypothesis to -lily words that have the repeated sequence "-ly" at both phonological and morphological levels (e.g., Plag 1998, Yip 1998 on OCP-type constraints; de Lacy 1999 on markedness constraints; Vosberg 2003, Rohdenburg 2003 on avoidance of horror aequi). Thus, it would predict that STEM and STEMY words would not be as unacceptable as STEMLY words, since neither STEM nor STEMY words have multiple adjacent morphemes of identical form. Rather, only STEMLY words would be doubly subject to the morphological and phonological constraints, so STEMLY words should be outliers in terms of their unacceptability.
(2) Identity hypothesis: The relevant constraints only apply to morpheme-sized units. Only when two string-identical morphemes are adjacent is the difficulty increased from baseline.
(STEM, STEMY) >> (STEMLY)
(JOLLY + LY, SMELL + Y + LY) >> (LOVE + LY + LY)
The final hypothesis we will consider is the gradient interaction hypothesis. This hypothesis considers the possibility that there is some measure of string similarity that interacts with morphological structure to produce an acceptability cline (following, e.g., Walter 2007 on perceptual acceptability in speech; Pounder 2004 on individual variation). In particular, this hypothesis predicts that the morphological simplicity of STEM words allows them to be the most acceptable of the three, whereas the combination of morphological complexity and phonologically identical adjacent morphemes of STEMLY words creates the least acceptable forms. This leaves STEMY words at some intermediate level of (un-)acceptability, since they are not doubly affected by biases or constraints against sequences of identical strings at both the phonological and morphological levels, but still contain more morphological complexity than STEM words.
(3) Gradient interaction hypothesis: There is an interaction of phonological constraints and morphological properties such that the decomposability and similarity of adjacent sequences produces gradient effects. Each different type of word is accepted at different rates and to different degrees because the morphological properties differ, but all have decreased acceptability because the phonological constraints apply simultaneously.
The morpheme boundary, identity, and gradient interaction hypotheses cover the three patterns we anticipate as being most likely, although they do not account for every conceivable pattern or explanation. In particular, they rely on the intuition that STEM words are most likely to be acceptable and that STEMLY words are least likely.
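To make the competing predictions easier to compare, the following Python sketch encodes each hypothesis as an ordered list of acceptability tiers and reports which pairwise contrasts between forms would tell two hypotheses apart. This is an illustrative encoding only: the dictionary `PREDICTIONS` and the helper names are hypothetical, and the three-tier ordering for the gradient interaction hypothesis is read off the prose description (STEM most acceptable, STEMY intermediate, STEMLY least).

```python
from itertools import combinations

FORMS = ("STEM", "STEMY", "STEMLY")

# Each hypothesis is a list of tiers, most acceptable first.
# Forms in the same tier are predicted not to differ.
PREDICTIONS = {
    "morpheme boundary": [{"STEM"}, {"STEMY", "STEMLY"}],
    "identity": [{"STEM", "STEMY"}, {"STEMLY"}],
    "gradient interaction": [{"STEM"}, {"STEMY"}, {"STEMLY"}],
}


def tier(hypothesis: str, form: str) -> int:
    """Index of the tier containing `form` (0 = most acceptable)."""
    for index, group in enumerate(PREDICTIONS[hypothesis]):
        if form in group:
            return index
    raise KeyError(form)


def predicted_relation(hypothesis: str, a: str, b: str) -> str:
    """Return '>', '<', or '=' for the predicted acceptability of a vs. b."""
    ta, tb = tier(hypothesis, a), tier(hypothesis, b)
    if ta == tb:
        return "="
    return ">" if ta < tb else "<"


def distinguishing_pairs(h1: str, h2: str) -> list:
    """Pairs of forms on which the two hypotheses make different predictions."""
    return [
        (a, b)
        for a, b in combinations(FORMS, 2)
        if predicted_relation(h1, a, b) != predicted_relation(h2, a, b)
    ]


for h1, h2 in combinations(PREDICTIONS, 2):
    print(f"{h1} vs. {h2}: {distinguishing_pairs(h1, h2)}")
```

Under this encoding, the morpheme boundary and gradient interaction hypotheses differ only in whether STEMY and STEMLY forms are predicted to differ, so that contrast is the one an experiment would need to resolve in order to separate them.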
(4) The bubbly cheerleaders practiced their routines spiritedly on weekends before big games.
(5) The spirited cheerleaders practiced their routines bubblily on weekends before big games.
2.1 EXPERIMENT 1. In the first experiment, 72 participants were recruited via Amazon Mechanical Turk with IP addresses limited to the United States. Participants were asked to rate the "naturalness" of each sentence in a randomized list on a scale of 1 (unnatural) to 7 (natural). Ratings were untimed. A total of 18 stimuli (6 of each FORM condition) were presented to each participant, counterbalanced between PART OF SPEECH conditions (ADJECTIVAL and ADVERBIAL). Participants also rated sentences from three other sets of stimuli from unrelated and non-conflicting experiments for a total of 90 sentences per participant. One set of these filler items contained marginally grammatical lexical items. Ordinal rating data were analyzed using linear mixed effects regressions with a cumulative link (Christensen 2015, R Core Team 2017). Two independent variables (FORM and PART OF SPEECH) and their interaction were included in the maximal model, along with random effects by subject and item. To evaluate the contribution of each of the factors and their interaction to the overall fit of the model, the maximal model was compared to depleted models with the relevant term removed. This model comparison revealed a main effect of PART OF SPEECH, with the ADJECTIVE condition rated higher than the ADVERB condition (β=-1.95, SE=0.19, LR(1)=86.1, p<0.0001). No main effect of FORM was detected, although the interaction between FORM and PART OF SPEECH was significant. The interaction seems to be driven by STEM forms being rated higher than the other two forms (STEM-STEMY: β=0.59, SE=0.26; STEM-STEMLY: β=1.14, SE=0.26; STEMY-STEMLY: β=0.56, SE=0.26; LR(2)=18.7, p<0.0001).
From visual inspection of the data, it appears that the reported interaction is at least marginally present in both the ADJECTIVE and ADVERB conditions. Due to the between-items and between-participants nature of a pairwise analysis of the three forms, the statistical power is too low for such an analysis to be reliable. This may be rectified in the future with a different task that allows for a more direct probe of the well-formedness and acceptability of the lexical items and thus introduces less noise. As it stands, however, we will only speculate as to what the trend indicates.
2.2 EXPERIMENT 2. In the second experiment, 40 participants were recruited via Amazon Mechanical Turk with IP addresses limited to the United States. Participants were asked to rate the "naturalness" of each sentence in a randomized list, then provide a confidence rating for their judgments from 1 (uncertain) to 5 (certain). Ratings were timed, so the rating scale was reduced to 1 (unnatural) to 5 (natural). A total of 18 stimuli, 6 from each category, were presented to each participant and counterbalanced between ADJECTIVAL and ADVERBIAL conditions. Participants also rated sentences from three other sets of stimuli from unrelated and non-conflicting experiments for a total of 90 sentences per participant. One set of these filler items contained marginally grammatical lexical items. Ordinal rating data were analyzed using the methods described for Experiment 1. As in Experiment 1, model comparison revealed a main effect of PART OF SPEECH, with the ADJECTIVE condition rated higher than the ADVERB condition (β=-1.56, SE=0.32, LR(1)=19.73, p<0.0001). Again, no main effect of FORM was detected, although the interaction between FORM and PART OF SPEECH was found to be significant. The interaction in this analysis also seems to be driven by STEM forms being rated higher than the other two forms (STEM-STEMY: β=1.04, SE=0.42; STEM-STEMLY: β=1.28, SE=0.42; STEMY-STEMLY: β=0.24, SE=0.40; LR(2)=10.28, p=0.006), effectively replicating Experiment 1.
Since the Likert scale used in this experiment had fewer points than in the previous experiment, we might expect less clear distinction between forms, and to some extent this is what we find. Visual inspection of the data suggests that the significant interaction is not only driven by sentences containing STEM words being rated more highly than the others, but specifically that this is the case in the STEM+ADJECTIVE condition. It is not immediately clear why this should be, since none of the ADJECTIVE condition sentences contained -lily words. However, it is possible that the decreased frequency of any of the adverbs used in the stimuli (whether -lily words or not) contributed to the pattern. Further investigation is required to determine the source of this pattern.
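The model comparisons in both experiments report likelihood-ratio statistics of the form LR(df). As a rough check on the reported p-values, the Python sketch below converts each statistic to a p-value under the assumption that it follows a chi-square distribution with the degrees of freedom given in parentheses. This is not the original analysis, which was run in R; the function name `chi2_sf` is hypothetical, and the closed-form expressions used here hold only for 1 and 2 degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        # Chi-square with 1 df is a squared standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # Chi-square with 2 df is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# Likelihood-ratio statistics as reported for the two experiments.
REPORTED = [
    ("Experiment 1, PART OF SPEECH", 86.1, 1),
    ("Experiment 1, FORM x PART OF SPEECH", 18.7, 2),
    ("Experiment 2, PART OF SPEECH", 19.73, 1),
    ("Experiment 2, FORM x PART OF SPEECH", 10.28, 2),
]

for label, lr, df in REPORTED:
    print(f"{label}: LR({df}) = {lr}, p = {chi2_sf(lr, df):.2g}")
```

For other degrees of freedom, a general-purpose statistics library would be the more sensible choice than extending these closed forms by hand.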
5. General Discussion. Although the tasks in Experiments 1 and 2 employed different scales and different time pressure, the results are consistent across experiments. In both experiments, a main effect of PART OF SPEECH was detected, indicating that sentences containing (adverbial) -lily words were rated lower than those without -lily words. However, this effect might be driven by sentences containing STEM words, which were rated higher than those containing STEMY or STEMLY words.
This evidence supports our claim that a phonology-only approach is insufficient when broaching the topic of dissimilation in -lily words and that morphological structure must play a role. This is similar to what is suggested by Nevins' (2012) multiple levels of exponence and their sensitivity to phonological or morphological similarity. Two of our morphological hypotheses thus remain candidates for explaining how -lily words are processed. Our results are inconsistent with the identity hypothesis, which predicted that STEMLY words should be rated the lowest of the three forms and that no difference between STEM and STEMY words should be detected. Since the sentences containing STEM words were rated higher than those containing STEMY or STEMLY words, the results are consistent with the morpheme boundary hypothesis. The gradient interaction hypothesis cannot be discarded, as it is not inconsistent with our results. In order to distinguish between the morpheme boundary and gradient interaction hypotheses, further investigation is required, employing tasks that are better suited to the research question.

Conclusion.
This study represents the initial steps toward a clearer understanding of why lexical gaps may persist despite productive morphological processes that would otherwise fill those gaps. With the caveat that the tasks employed in this study were not entirely well-suited to the research question, we find evidence that a purely phonological approach cannot capture the pattern of acceptability observed for sentences containing -lily words. Future studies will take a more direct approach to investigating the acceptability and well-formedness of -lily words by using such tasks as lexical decision, sentence completion, or further sentence acceptability judgment tasks.
We show here that both phonology and morphology affect the processing of highly similar morphophonological strings, although it remains an open question how much each contributes to how or whether we produce and accept words like jollily or lovelily. With the proposed extensions to the current study, we will come closer to determining the extent of influence from phonological dissimilation processes on the formation of lexical gaps.

Appendix.
STEM stimuli:
1. b) The grumpy bus driver was in a bad mood, so he shouted surlily at his passengers to be quiet.
2. a) The burly wrestler was surprisingly graceful, even though he was stockily built and always scowling.
b) The stocky wrestler was surprisingly graceful, even though he was burlily built and always scowling.
3. a) The jolly shopkeeper loved the community and would chat joyously with anyone who stopped by the shop.
b) The joyous shopkeeper loved the community and would chat jollily with anyone who stopped by the shop.
4. a) The holy ground around the ancient temple was treated sacredly by the locals.
b) The sacred ground around the ancient temple was treated holily by the locals.
5. a) The silly child played a game where he ran around the playground foolishly with his eyes closed.
b) The foolish child played a game where he ran around the playground sillily with his eyes closed.
6. a) The early start to her workday let the data analyst get to work freshly on problems from the night
b) The fresh start to her workday let the data analyst get to work earlily on problems from the night
7. a) The ghastly scene of the triple murder made onlookers gasp horridly and turn away.
b) The horrid scene of the triple murder made onlookers gasp ghastlily and turn away.
8. a) The ugly and unconventional painting was designed to drip grossly onto the floor below.
b) The gross and unconventional painting was designed to drip uglily onto the floor below.
9. a) The melancholy teenager could be found wandering gloomily around the park at twilight.
b) The gloomy teenager could be found wandering melancholily around the park at twilight.
10. a) The sly thief planned the major heist shrewdly so that there was no evidence left behind.
b) The shrewd thief planned the major heist slyly so that there was no evidence left behind.
STEMY stimuli:
11. a) The smelly stew had been made with old vegetables that wafted stinkily through the kitchen.
b) The stinky stew had been made with old vegetables that wafted smellily through the kitchen.
12. a) The chilly morning weather sent wind whipping frostily through everyone's scarves and gloves.
b) The frosty morning weather sent wind whipping chillily through everyone's scarves and gloves.
13. a) The prickly bushes stuck out across the path thornily and caught on the clothes of passersby.
b) The thorny bushes stuck out across the path pricklily and caught on the clothes of passersby.
14. a) The wobbly tower of blocks was crooked and swayed unbalancedly when anyone walked by.
b) The unbalanced tower of blocks was crooked and swayed wobblily when anyone walked by.
15. a) The oily plate of grilled vegetables glistened greasily in the low light of the restaurant.
b) The greasy plate of grilled vegetables glistened oilily in the low light of the restaurant.
16. a) The steely stare of the panther is paralyzing to anyone it cold-bloodedly locks eyes with.
b) The cold-blooded stare of the panther is paralyzing to anyone it steelily locks eyes with.
17. a) The hilly terrain slowly changed over eons, rising slopingly as the continents shifted.
b) The sloping terrain slowly changed over eons, rising hillily as the continents shifted.
18. a) The frilly dresses worn by the bridesmaids bounced gaudily as they walked down the aisle.
b) The gaudy dresses worn by the bridesmaids bounced frillily as they walked down the aisle.
19. a) The woolly lambs were shorn so that their spring coats could grow in fleecily and full.
b) The fleecy lambs were shorn so that their spring coats could grow in woollily and full.
20. a) The bubbly cheerleaders practiced their routines spiritedly on weekends before big games.
b) The spirited cheerleaders practiced their routines bubblily on weekends before big games.
STEMLY stimuli:
21. a) The lively conversation in the pub carried on animatedly until last call was announced.
b) The animated conversation in the pub carried on livelily until last call was announced.
22. a) The friendly neighbor baked apple pies lovingly for everyone on the block last week.
b) The loving neighbor baked apple pies friendlily for everyone on the block last week.
23. a) The portly actor sauntered around the stage beefily to exaggerate his own size.
b) The beefy actor sauntered around the stage portlily to exaggerate his own size.
24. a) The orderly accountant straightened everything on her desk tidily before going home.
b) The tidy accountant straightened everything on her desk orderlily before going home.
25. a) The cowardly rabbit peeked out of its hiding spot timidly whenever it heard a noise.
b) The timid rabbit peeked out of its hiding spot cowardlily whenever it heard a noise.
26. a) The motherly babysitter hugged the children affectionately before sending them to bed.
b) The affectionate babysitter hugged the children motherlily before sending them to bed.
27. a) The ghostly whispers echoed throughout the house spookily as the film crew started to worry.
b) The spooky whispers echoed throughout the house ghostlily as the film crew started to worry.
28. a) The lonely cabin that had been abandoned for years stood isolatedly in the dark forest.
b) The isolated cabin that had been abandoned for years stood lonelily in the dark forest.
29. a) The heavenly sound of the choir rung out celestially in the large, gleaming church.
b) The celestial sound of the choir rung out heavenlily in the large, gleaming church.
30. a) The scholarly librarian was trying to educatedly argue her point without raising her voice.
b) The educated librarian was trying to scholarlily argue her point without raising her voice.

Table 1: Mean (and median) frequencies for all target and synonym lexical items.

Table 2: Guide for structure of the three types of -lily words referred to in this study.