Listeners use descriptive contrast to disambiguate novel referents and make inferences about novel categories

In the face of unfamiliar language or objects, description is one cue people can use to learn about both. Beyond narrowing potential referents to those that match a descriptor, listeners could infer that a described object is one that contrasts with other relevant objects of the same type (e.g., “the tall cup” contrasts with another, shorter cup). This contrast may be in relation to other objects present in the environment or to the referent’s category. In two experiments, we investigate whether listeners use descriptive contrast to resolve reference and make inferences about novel referents’ categories. While participants use size adjectives contrastively to guide novel referent choice, they do not reliably do so using color adjectives (Experiment 1). Their contrastive inferences go beyond the current referential context: participants use description to infer that a novel object is atypical of its category (Experiment 2). Overall, people are able to use descriptive contrast to resolve reference and make inferences about a novel object’s category, allowing them to infer new word meanings and learn about new categories’ feature distributions.


Introduction.
Suppose a friend asked you to "Pass the tall dax." You might look around the room for two similar things that vary in height, and hand the taller one to them. This is how people respond to adjectives like "tall" with known objects: they preferentially consider candidate referents that have short competitors as soon as they hear "tall" (Sedivy et al., 1999). If, on the other hand, there were no objects that varied only in their size, you might infer something different: most daxes must be shorter than the one your friend wants, since people tend to mention atypical features more than typical ones (Mitchell et al., 2013; Rubio-Fernández, 2016). From the indirect information in your friend's utterance, you could in principle learn either the meaning of a new word, the typical size of a new category, or both. But would you be likely to in practice? In a set of two experiments, we tested whether people use adjectives like "small" and "red" contrastively to determine the meaning of the novel word they heard, whether these adjectives lead people to infer the typical color or size of the described object's category, and whether these two processes interact.
Studies using familiar objects show that people use adjective description contrastively to guide their identification of the referent (Sedivy et al., 1999; Sedivy, 2003). In one such task, four objects appeared on a screen: a target (e.g., a tall cup), a contrastive pair (e.g., a short cup), a competitor that shares the target's feature but not category (e.g., a tall pitcher), and an irrelevant distractor (e.g., a key). Participants then heard a referring expression: "Pick up the tall cup." Participants looked more quickly to the correct object when the utterance referred to an object with a same-category contrastive pair (tall cup vs. short cup) than when it referred to an object without a contrastive pair (e.g., when there was no short cup in the display). These results suggest that listeners expect speakers to use description when they are distinguishing between potential referents of the same type, and use this inference to rapidly allocate their attention to the target object as an utterance progresses. This principle does not apply equally across adjective types, however: color adjectives seem to hold less contrastive weight (Sedivy, 2003), perhaps because color adjectives are often used redundantly in English (Pechmann, 1989). These experiments demonstrate that listeners use contrast among familiar referents to guide their attention allocation, though not their explicit referent choice, which occurs after the noun disambiguates the object.
Beyond contrasting a referent with other objects in the present environment, description may draw a contrast between a referent and its category. In production studies, participants tend to describe atypical features more than they describe typical ones (Mitchell, Reiter, & Deemter, 2013; Rubio-Fernández, 2016; Westerbeek, Koolen, & Maes, 2015). For instance, they almost always include a color descriptor when referring to a blue banana, but not when referring to a yellow one. People therefore use knowledge of contrast with present objects and with an object's category to inform their production and comprehension of adjectives. Can they turn this process around, using the principle of descriptive contrast to learn about novel objects and categories in the world?
In this paper, we present a series of experiments to test whether and how listeners make inferences about novel referents using descriptive contrast. First, we examine whether listeners use descriptive contrast to resolve referential ambiguity. In a reference game, participants see groups of novel objects and hear a referring expression asking them to pick one, e.g., "Find the small toma." If participants interpret description contrastively, they should infer that the description was necessary to identify the referent, that is, that the small toma contrasts with some differently sized toma on the screen. Using this contrastive inference, they can resolve referential ambiguity, choosing a small object with a similar larger counterpart rather than a small object with no similar counterpart nearby. Second, we test whether listeners use descriptive contrast to make inferences about a novel object's category. Participants are presented with two interlocutors who exchange objects using referring expressions, such as "Pass me the blue toma." If participants interpret description as contrasting with an object's category, they should infer that in general, few tomas are blue. Further, these two inferences may trade off: if the objects in the scene necessitate adjective use to identify the referent uniquely (e.g., there are blue and red tomas in the scene), participants may be less likely to attribute the adjective to a contrast with the category.
In order to determine whether people can use contrastive inferences to disambiguate and learn about referents, and how those inferences are affected by adjective type, we use reference games with novel objects. Novel objects provide both a useful experimental tool and an especially interesting testing ground for contrastive inferences. These objects have unknown names and feature distributions, creating the ambiguity that is necessary to test referential disambiguation and category learning. But the ability to disambiguate novel referents, or to establish reference with incomplete information, is also the broader problem of learning about the world. We know that distributional information in the world affects people's pragmatic use and interpretation of description in familiar contexts (Sedivy, 2003; Westerbeek et al., 2015). Here, we ask: can people use pragmatic inferences from description to learn about unfamiliar things in the world?

Experiment 1.
In Experiment 1, we tested whether adult participants use adjective contrast to select an ambiguously mentioned novel referent. In a referential disambiguation task, we presented participants with arrays of novel fruit objects (Figure 1). On critical trials, participants saw a target object, a lure object that shared the target's contrast feature but not its shape, and a contrastive pair that shared the target's shape but not its contrast feature. Participants heard a referring expression denoting the feature, e.g., "Find the [blue/big] dax." For the target object, use of the adjective was necessary to disambiguate it from the distractor with the same shape but a different color or size; for the lure, the adjective would be superfluous description. If participants use contrastive inference to choose novel referents, they should choose the target object. However, we do not expect listeners to treat color and size equally. Because color is described more superfluously in English than size, we expect size to hold more contrastive weight, encouraging a more consistent contrastive inference.
2.1. METHOD. We recruited 300 participants through Amazon Mechanical Turk. Half of the participants were assigned to a condition in which the critical feature was color (stimuli contrasted on color), and the other half were assigned to a condition in which the critical feature was size.
Stimulus displays were arrays of three novel fruit objects. Fruits were chosen randomly at each trial from 25 fruit kinds. Ten of the 25 fruit drawings were adapted and redrawn from Kanwisher, Woods, Iacoboni, and Mazziotta (1997); we designed the remaining 15 fruit kinds. Each fruit kind has an instance in each of four colors (red, blue, green, or purple) and two sizes (big or small). Particular target colors were assigned randomly at each trial and particular target sizes were counterbalanced across display types. There were two display types: unique target displays and contrastive displays. Unique target displays contained a target object that had a unique shape and was unique on the trial's critical feature (color or size), and two distractor objects that matched each other's (but not the target's) shape and critical feature. Contrastive displays contained a target, its contrastive pair (matched the target's shape but not its critical feature), and a lure (matched the target's critical feature but not its shape). The positions of the target and distractor items were randomized within a triad configuration.
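The two display types described above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' stimulus code: the names `Item` and `make_display` are hypothetical, fruit kinds are represented by an index, and feature values are sampled at random, which simplifies the counterbalancing of target sizes reported in the text.

```python
import random
from dataclasses import dataclass

COLORS = ["red", "blue", "green", "purple"]
SIZES = ["big", "small"]


@dataclass(frozen=True)
class Item:
    shape: int   # index of the fruit kind (hypothetical encoding, 0..24)
    value: str   # value of the critical feature (a color or a size)
    role: str    # "target", "distractor", "contrastive_pair", or "lure"


def make_display(display_type, values, rng, n_kinds=25):
    """Build one three-object display for the given critical-feature values."""
    # Assumption: values are drawn at random; the paper counterbalances sizes.
    target_value, other_value = rng.sample(values, 2)
    target_shape, other_shape = rng.sample(range(n_kinds), 2)
    target = Item(target_shape, target_value, "target")
    if display_type == "unique_target":
        # Two distractors match each other, but not the target,
        # in both shape and critical feature.
        others = [Item(other_shape, other_value, "distractor"),
                  Item(other_shape, other_value, "distractor")]
    elif display_type == "contrastive":
        # Contrastive pair: same shape as the target, different feature value.
        # Lure: same feature value as the target, different shape.
        others = [Item(target_shape, other_value, "contrastive_pair"),
                  Item(other_shape, target_value, "lure")]
    else:
        raise ValueError(f"unknown display type: {display_type}")
    items = [target] + others
    rng.shuffle(items)  # randomize positions within the triad
    return items


display = make_display("contrastive", SIZES, random.Random(1))
```

In a contrastive display built this way, an adjective such as "small" fits both the target and the lure, so only a contrastive reading of the adjective favors the target.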
Participants were told they would play a game in which they would search for strange alien fruits. Each participant saw eight trials. Half of the trials were unique target displays and half were contrastive displays. Crossed with display type, half of trials had audio instructions that described the critical feature of the target (e.g., "Find the [blue/big] dax"), and half of trials had audio instructions with no adjective description (e.g., "Find the dax"). Participants clicked on the objects to respond. A name was randomly chosen at each trial from a list of eight nonce names: blicket, wug, toma, gade, sprock, koba, zorp, and lomet.
After completing the study, participants were asked to select which of a set of alien words they had heard previously during the study. Four were words they had heard, and four were novel lure words. Participants were dropped from further analysis if they did not respond to at least 6 of these 8 correctly (above chance performance as indicated by a one-tailed binomial test at the p = .05 level) or if they missed any of four color perception check trials (resulting n = 163).
Figure 1. Experiment 1 stimuli. On the left: an example of a contrastive trial in which the critical feature is size. Here, the participant would hear the instruction "Find the small dax." On the right: an example of a contrastive trial in which the critical feature is color. Here, the participant would hear the instruction "Find the red dax." In both cases, the target is the top object.
2.2. RESULTS. We first confirmed that participants understood the task by analyzing performance on unique target trials, in which the target had no competitors with the same shape or critical feature (color or size). We asked whether participants chose the target more often than expected by chance (33%) by fitting a mixed effects logistic regression with an intercept term, a random effect of subject, and an offset of logit(1/3) to set chance probability to the correct level. The intercept term was reliably different from zero for both color (β = 6.64, t = 4.10, p < 0.001) and size (β = 2.25, t = 6.91, p < 0.001). In addition, participants were more likely to select the target when an adjective was provided in the audio instruction in both conditions. We confirmed this effect statistically by fitting a mixed effects logistic regression predicting target selection from condition, adjective use, and their interaction with random effects of participants. Adjective type (color vs. size) was not statistically related to target choice (β = -0.48, t = -1.10, p = 0.27), and adjective description in the utterance increased target choice (β = 3.85, t = 3.52, p < 0.001). Participants had a general tendency to choose the target in unique target trials, which was strengthened if the audio instruction contained the relevant adjective.
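To make the role of the logit(1/3) offset concrete, the following minimal sketch shows how an intercept on this scale maps back to a choice probability. It is an illustration under stated assumptions rather than the analysis code: the helper names are hypothetical, and the conversion ignores the subject random effect, so it describes a participant whose random intercept is zero.

```python
import math

CHANCE = 1 / 3  # three objects per display


def logit(p):
    """Log-odds of probability p."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def implied_probability(intercept, chance=CHANCE):
    """Probability of choosing the target implied by a fitted intercept
    when logit(chance) is included as an offset (random effects set to 0)."""
    return inv_logit(logit(chance) + intercept)


# An intercept of zero corresponds exactly to chance responding.
at_chance = implied_probability(0.0)

# Intercepts reported in the text for the size and color conditions.
p_size = implied_probability(2.25)
p_color = implied_probability(6.64)
```

Under these assumptions, the reported intercepts correspond to target choice probabilities of roughly 0.83 for size and above 0.99 for color, which is why an intercept reliably above zero indicates above-chance performance.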
Our key test was whether participants would choose the target object on contrastive trials in which description was given, reflecting use of a contrastive inference to choose a novel referent (Figure 2). To test this, we compared participants' rate of choosing the target to their rate of choosing the lure, which shares the relevant feature with the target. Across all contrast trials, use of an adjective shifted participants toward choosing the target rather than the lure (β = 2.07, t = 6.24, p < 0.001). When size was specified, participants chose the target significantly more often than the lure (β = 0.86, t = 4.41, p < 0.001). However, when color was specified, participants did not choose the target significantly more often than the lure (β = 0.15, t = 0.75, p = 0.45). Among contrast trials in which an adjective was not given, participants dispreferred the target, instead choosing the lure object which matched the target's feature but had a unique shape (β = -2.65, t = -5.44, p < 0.001). Participants' choice of the target over the lure in the size condition was therefore not due to a prior preference for the target in contrast displays, but relied on contrastive interpretation of the adjective.
Figure 2. Experiment 1 contrast trial results. Proportion of times that participants chose the target and lure items as a function of adjective condition (color vs. size) and whether an adjective was provided in the utterance. Error bars indicate 95% confidence intervals.
2.3. DISCUSSION. When faced with unfamiliar objects referred to by unfamiliar names, people must resolve ambiguity to understand their conversational partner and learn more about words and their referents in the world. In Experiment 1, we tested whether people could use contrastive inferences to resolve ambiguous reference to novel objects. We found that participants have a general tendency to choose objects that are unique in shape when reference is ambiguous. However, when people hear an utterance with description (e.g., "blue toma", "small toma"), they shift away from choosing unique objects and toward choosing objects that have a similar contrasting counterpart. Furthermore, use of size adjectives (but not color adjectives) prompts people to choose the target object with a contrasting counterpart significantly more often than the unique lure object. Thus, we found that people are able to use contrastive inferences about size to successfully resolve which unfamiliar object an unfamiliar word refers to.

Experiment 2.
In Experiment 1, we examined whether people would interpret description as implying contrast with other present objects. However, as noted, description can imply contrast with sets other than the set of currently available referents. One of these alternative sets is the referent's category. Work by Mitchell et al. (2013) and Westerbeek et al. (2015) demonstrates that speakers use more description when referring to objects with atypical features (e.g., a yellow tomato) than typical ones (e.g., a red tomato). This selective marking of atypical objects potentially supplies useful information to listeners: they have the opportunity to not only learn about the object at hand, but also about the typical features of its category. In the following experiment, we test whether people use this type of contrast to make inferences about a novel category's feature distribution.
3.1. METHOD. Two hundred and forty participants were recruited from Amazon Mechanical Turk. 120 participants were assigned to a condition in which the critical feature was color (red, blue, purple, or green), and 120 participants were assigned to a condition in which the critical feature was size (small or big). Stimulus displays showed two alien interlocutors, one on the left side (Alien A) and one on the right side (Alien B) of the screen, each with two novel fruit objects beneath them (Figure 3). Alien A, in a speech bubble, asked Alien B for one of its fruits (e.g., "Hey, pass me the big toma"). Alien B replied, "Here you go!" and the referent disappeared from Alien B's side and reappeared on Alien A's side.
Figure 3. Experiment 2 stimuli. In the above example, the critical feature is size and the object context is a within-category contrast: the alien on the right has two same-shaped objects that differ in size.
Two factors, presence of the critical adjective in the referring expression and object context, were fully crossed within subjects. Object context had three levels: within-category contrast, between-category contrast, and same feature. In the within-category contrast condition, Alien B possessed the target object and another object of the same shape, but with a different value of the critical feature (color or size). In the between-category contrast condition, Alien B possessed the target object and another object of a different shape, and with a different value of the critical feature. In the same feature condition, Alien B possessed the target object and another object of a different shape but with the same value of the critical feature as the target. Thus, in the within-category contrast condition, the descriptor is necessary to distinguish the referent; in the between-category contrast condition it is unnecessary but potentially helpful; and in the same feature condition it is unnecessary and unhelpful (see example stimuli in Figure 4). All object contexts equated direct observation of the target object category's feature distribution: participants saw the target object and one other object with the target's shape and a different critical feature value. We manipulated the critical feature type (color or size) between subjects.
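The within-subject design above can be written out as a small lookup table. This is a sketch for orientation only: the dictionary layout and names are hypothetical, and treating the six crossed cells as one trial each is an assumption that would line up with the six trials per participant reported for this experiment.

```python
import itertools

# For Alien B's second object in each context:
# (shares the target's shape, shares the target's critical feature value,
#  status of the descriptor for picking out the referent)
CONTEXTS = {
    "within_category_contrast": (True, False, "necessary"),
    "between_category_contrast": (False, False, "unnecessary but potentially helpful"),
    "same_feature": (False, True, "unnecessary and unhelpful"),
}
UTTERANCES = ["adjective", "no_adjective"]

# Fully crossing utterance type with object context gives 2 x 3 cells.
conditions = list(itertools.product(UTTERANCES, CONTEXTS))

for utterance, context in conditions:
    same_shape, same_value, status = CONTEXTS[context]
    print(f"{utterance:13s} {context:26s} descriptor is {status}")
```

Only the within-category contrast context leaves the bare noun ambiguous between Alien B's two objects, which is what makes the adjective necessary there.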
Participants performed six trials. After each exchange between the alien interlocutors, they made a judgment about the prevalence of the target's critical feature in the target object's category. For instance, after seeing a red blicket being exchanged, participants would be asked, "On this planet, what percentage of blickets do you think are red?" and answer on a sliding scale between zero and 100. In the size condition, participants were asked, "On this planet, what percentage of blickets do you think are the size shown below?" with an image of the target object they just saw available on the screen.
After completing the study, participants performed the same novel word attention check described in Experiment 1, and were excluded if they did not respond to at least 6 out of 8 words correctly (resulting n = 193).
Figure 4. Experiment 2 prevalence judgments. Participants consistently judged the target object as less typical of its category when the referent was described with an adjective (e.g., "Pass me the purple toma") than when it was not (e.g., "Pass me the toma").
3.2. RESULTS. We analyzed participants' judgments of the prevalence of the target object's critical feature in its category (Figure 4). We fit a maximal mixed-effects linear model, including effects of utterance type (adjective or no adjective), context type (within-category contrast, between-category contrast, or same feature), and critical feature (color or size) as well as all interactions between these three factors and random effects of utterance type and context type nested within subject. Random effects were removed until the model converged, resulting in a final model with effects of, and interactions between, critical feature, utterance type, and context type, and a random effect of utterance type nested within subject. The final model revealed a significant effect of utterance type (βadjective = -11.79, t = -3.90, p < 0.001), such that when an adjective was used, participants inferred that the described feature was less prevalent in the object's category. There was no significant effect of critical feature (βsize = 3.36, t = 1.03, p = 0.301), and there was a significant effect of same-feature context relative to within-category contrast context, but no significant effect of between-category relative to within-category contrast contexts (βsame = -5.41, t = -2.25, p = 0.025; βbetween = -3.92, t = -1.62, p = 0.104). There were no significant interactions. That is, participants slightly adjusted their inferences according to the object context, though not in a way that depended on whether an adjective was used in the utterance. However, they robustly inferred that described features were less prevalent in the target's category than unmentioned features.
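For readers who prefer the final model written out, one possible reading is given below. This is a sketch under assumptions that the text does not spell out: treatment (dummy) coding with no adjective, within-category contrast, and color as reference levels, and a by-subject random intercept alongside the by-subject random slope for utterance type. The symbols $A$, $S$, $B$, and $Z$ are introduced here for illustration.

```latex
\begin{align*}
y_{ij} &= \beta_0
  + \beta_{\text{adjective}} A_{ij}
  + \beta_{\text{same}} S_{ij}
  + \beta_{\text{between}} B_{ij}
  + \beta_{\text{size}} Z_{j} \\
&\quad + \bigl(\text{all two- and three-way interactions among } A,\ \{S, B\},\ Z\bigr) \\
&\quad + u_{0j} + u_{1j} A_{ij} + \varepsilon_{ij}
\end{align*}
```

Here $y_{ij}$ is the prevalence judgment (0 to 100) of subject $j$ on trial $i$; $A_{ij} = 1$ when the utterance contained an adjective; $S_{ij}$ and $B_{ij}$ indicate the same-feature and between-category contexts; $Z_j = 1$ for subjects in the size condition; and $u_{0j}$, $u_{1j}$ are subject-level random effects. On this reading, the reported $\beta_{\text{adjective}} = -11.79$ is the estimated drop in judged prevalence, in percentage points, when an adjective is used at the reference levels of the other factors.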

General Discussion.
Overall, we found that people are able to use descriptive contrast to infer the referent of a novel word and to make inferences about a novel referent's category. In our first experiment, participants were able to resolve referential ambiguity using a contrastive interpretation of size adjectives, though not reliably with color adjectives. In our second experiment, participants inferred that a described referent was atypical of its category on that feature: hearing "big toma" led them to think that most tomas were not that size. In real life it is often unclear whether description is meant to contrast with present objects or imply atypicality. In our second experiment, participants adjusted their typicality judgments slightly based on the object context: when the object context was two objects of different kinds with the same critical feature, they judged the target object as slightly more atypical. However, participants did not significantly adjust their prevalence judgments based on the interaction of adjective use and object context; that is, they did not adjust their inferences about typicality based on how redundant description was in context. Further, contexts in which description was necessary to identify the referent did not preempt inferences of atypicality. The relative robustness of contrastive inferences about typicality across contexts and adjective types compared to contrastive inferences among present referents raises questions about the relative importance of these two kinds of contrast in language understanding. Most prior work has focused on contrast with present referents as the main phenomenon of interest, with object typicality as a modulating factor; our results emphasize the role of contrast with an object's category, particularly when ambiguity is at play.
Future work will explore whether people make subtle trade-offs between contrast with present referents and with the referent's category, and why color-size asymmetries seem to differ between these two types of contrast. Further, use of adjectives has been shown to allow children to make contrastive inferences among familiar present objects (Huang & Snedeker, 2008) and, when paired with contrastive cues such as prosody, about novel object typicality (Horowitz & Frank, 2016); future work will explore whether adjective contrast alone is a viable learning tool in early childhood. Contrastive inferences allow people to learn the meanings of new words and the typical features of new categories, pointing to a broader potential role of pragmatic inference in learning about the world.