Proportions vs. cardinalities: Comparative ambiguities and the COVID pandemic

This paper reports two psycholinguistic experiments on quantity comparatives and superlatives that are potentially ambiguous between cardinal and proportional readings. By using statements about COVID cases and vaccination numbers as a naturalistic context with real-world relevance, this work furthers our understanding of what happens in linguistic environments where multiple measure functions are available: what modulates the choice between them? The results provide new evidence that comparatives and superlatives can refer to scales ranging over degrees of proportion, in addition to degrees of cardinality. Furthermore, this experimental evidence points to a preference for cardinal interpretations, but also shows that this preference is not rigid and can be weakened in favor of proportional readings by semantic factors (including considerations potentially related to stage- vs. individual-level differences) and by certain linguistic forms.

Though both cardinal and proportional readings have been argued to be available, questions remain about how to formally capture the semantics of these readings. Some recent theoretical work has explored the idea of using underspecified measure functions (see e.g. Bale & Schwarz 2020, Bale & Barner 2009, see also Wellwood 2015 and subsequent work). For example, Bale & Schwarz discuss examples involving what they call 'contextual proportionality,' where the measure function is not fully specified by the conventional meaning of the sentence and instead is contextually determined. It's also worth noting that, as Bale & Schwarz point out, the availability of cardinal interpretations can make it hard to detect the existence of proportional interpretations. Indeed, intuitively, cardinal readings are preferred ('easier to get'). This raises questions both about the strength/robustness of this cardinality bias (can it be weakened or even overcome?), and about the factors that might modulate the preference for one reading over the other. I take a closer look at these issues next.
1.1. CARDINALITY BIAS. Broadly speaking, prior work seems to assume that cardinal readings are preferred over proportional readings (e.g. Solt 2018, Bale & Schwarz 2020), but, to the best of my knowledge, this has not been systematically tested experimentally. Thus, this paper reports two psycholinguistic studies that aim to assess whether one reading is preferred over the other in contexts where both cardinal and proportional readings are contextually available. From a processing point of view, a dispreference for proportional readings would not be unexpected, due to their greater complexity: Proportional readings depend on both the numerator and the denominator, whereas cardinal readings essentially only make reference to the numerator. This leads us to expect a cardinality bias, a preference to interpret ambiguous comparatives as comparing cardinalities, not proportions (other things being equal).

1.2. INDIVIDUAL VS. STAGE-LEVEL PREDICATES.
In addition to the cardinality bias, prior work suggests that cardinal vs. proportional readings (at least with certain quantifiers) are constrained by semantic considerations about predicate type (e.g. Partee 1989, Huettner 1984, Solt 2018, see also Milsark 1977), in particular the distinction between stage-level predicates (describing transient properties, e.g. is feverish) and individual-level predicates (describing more permanent properties, e.g. has a college degree; Carlson 1977). More specifically, it has been noted that, when combined with individual-level predicates, many and few seem to only allow (or perhaps very strongly prefer) the proportional reading, not the cardinal reading. Although prior work does not specifically discuss comparatives or superlatives of the types tested in the experiments in this paper, if these observations are more broadly relevant, they lead us to expect that individual-level predicates might favor proportional interpretations more than stage-level predicates do. In other words, the hypothesized cardinality bias might be weakened when we are comparing individual-level properties.
The stage-level vs. individual-level distinction corresponds well to two kinds of statements that were highly prevalent during the height of the COVID pandemic, namely statements about COVID cases/infections and statements about vaccination numbers. While having COVID can be regarded as a non-permanent, transient, stage-level property, being vaccinated is a more individual-level, permanent property. Indeed, the prediction that individual-level predicates favor proportional interpretations more than stage-level predicates seems to match how information about COVID cases and vaccination numbers was communicated during the height of the pandemic (at least in the U.S.): Impressionistically, it seems that COVID cases were typically reported in the news and in public health communication using raw numbers (cardinal) or per 100,000 (proportional), whereas vaccination numbers appeared to be mostly reported using percentages (proportional). Thus, the second aim of the present work is to explore the link between stage- vs. individual-level predicates and the likelihood of cardinal vs. proportional readings through the lens of pandemic-related comparatives.
This difference in how COVID case numbers and vaccination numbers are conceptualized is also revealed by how these two types of information are graphically reported in (U.S.) public health communications: While COVID cases are often graphed in a non-cumulative way (showing temporary spikes in case numbers during surges/waves), vaccination numbers tend to be graphed in a cumulative manner (as gradually increasing curves). This indicates, again, that we are more likely to regard having COVID as a more temporary state (stage-level) and being vaccinated as a more permanent property (individual-level).
Having said this, it is important to acknowledge that other factors beyond the stage- vs. individual-level distinction may also turn out to be at play, as discussed below. Thus, the studies here represent an initial step that should be complemented by further work.
1.3. PREDICTIONS. The two experiments reported here explore two (non-mutually exclusive) predictions: First, the cardinality bias, which predicts that cardinality readings are generally preferred over proportional readings in quantity comparatives (Experiment 1) and in superlatives (Experiment 2). Second, the experiments also test the semantic factors hypothesis, according to which the availability of cardinality and proportional interpretations can be modulated by semantic factors (in ways that may be related to the individual-/stage-level distinction). To test these predictions, I used sentences about (i) COVID cases/infections (stage-level), and about (ii) vaccinations/vaccinated people (more individual-level).

THE COVID PANDEMIC AS A NATURAL EXPERIMENT.
To investigate whether ambiguous comparatives and superlatives about COVID cases and vaccinated people exhibit a preference for cardinal readings over proportional readings, and to see whether this is modulated by semantic factors (transient vs. more stable properties), two experiments were conducted. The COVID pandemic provides a natural context for investigating the cardinal-proportional ambiguity, because it is a distinction that becomes very relevant when talking about COVID infections or vaccinations. This is illustrated by (3), a (simplified version of a) naturally-occurring example from the internet, where Person A is presumably assuming a cardinal interpretation and doubts its veracity, and B seems to be trying to explain that the relevant reading is the proportional one. Thus, the pandemic provides a meaningful, naturalistic context for experiments on this topic.

(3) Confusion between cardinal vs. proportional (simplified from www)
Person A: Alaska has more COVID than California…riiiight.

To test the interpretation of comparatives, eight pairs of COVID information dashboards were created, mimicking the dashboards used in the U.S. to report COVID cases and vaccination rates during the height of the pandemic. Each dashboard provided information for an imaginary U.S. county. Each dashboard reported the raw number of new COVID cases (cardinal information), the per 100,000 COVID case rate (proportional information), the raw number of fully vaccinated people (cardinal information), and the percent of fully vaccinated people (proportional information). Thus, both proportional and cardinal information are saliently available on the dashboards. This means that any differences in people's willingness to consider cardinal vs. proportional readings cannot be straightforwardly attributed to the proportional reading being very 'low salience' or hard to access, given that it is explicitly depicted on the dashboards. The dashboards also mentioned the number of COVID tests done and the positivity rate, but this information was blurred out as it was not relevant for this study. (The study was designed and implemented before younger children were eligible for vaccination in the U.S. and thus the information about vaccination rates excludes children under age 12.)

The dashboards, which were presented to participants in pairs as shown in Figure 1, were designed so that one county had higher raw numbers of COVID cases (or higher raw numbers of vaccinated people), and the other one had a higher proportional COVID rate (or higher percentage of vaccinated people). Thus, the cardinal vs. proportional readings are truth-conditionally distinct. (The magnitude of the differences in COVID cases and vaccination numbers between the two counties varies across items.)
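The requirement that the two readings come apart on every dashboard pair can be illustrated with a short check. The following Python sketch is illustrative only and is not part of the experimental materials: the Dashboard class, its field names, the function readings_dissociate, and all numbers are hypothetical; only the two county names are taken from Figure 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dashboard:
    """One imaginary county's dashboard (all values below are hypothetical)."""
    county: str
    new_cases: int               # raw number of new COVID cases (cardinal)
    cases_per_100k: float        # COVID case rate per 100,000 (proportional)
    fully_vaccinated: int        # raw number of fully vaccinated people (cardinal)
    pct_fully_vaccinated: float  # percent fully vaccinated (proportional)


def readings_dissociate(a: Dashboard, b: Dashboard,
                        cardinal: str, proportional: str) -> bool:
    """True if one county is strictly higher on the cardinal field while
    the other county is strictly higher on the proportional field."""
    cardinal_diff = getattr(a, cardinal) - getattr(b, cardinal)
    proportional_diff = getattr(a, proportional) - getattr(b, proportional)
    # Opposite signs mean the two readings pick out different counties.
    return cardinal_diff * proportional_diff < 0


# Hypothetical values: a large county with many cases but a low rate,
# and a small county with few cases but a high rate.
teakley = Dashboard("Teakley County", 480, 95.0, 210_000, 41.5)
hawkton = Dashboard("Hawkton County", 120, 310.0, 18_000, 46.2)

print(readings_dissociate(teakley, hawkton, "new_cases", "cases_per_100k"))
print(readings_dissociate(teakley, hawkton,
                          "fully_vaccinated", "pct_fully_vaccinated"))
```

In a pair that passes this check, a sentence such as 'There are more COVID cases in ___ than ___' is true under exactly one ordering of the county names per reading, which is what allows responses to be diagnostic.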
On each screen, below the pair of dashboards, participants saw a critical sentence with two blanks where the names of the counties should be. Their task was to type the missing information into the textbox below the sentence. A second textbox (not shown above) was also included, asking participants to explain what information they used to fill in the blanks. This was done to keep participants on task and to provide us with an additional means of checking participants' responses if their answers were unclear.
Due to the truth-conditionally distinct design, participants' responses reveal whether they opted for a cardinal or a proportional interpretation. For example, for the display and the sentence in Figure 1 (There are more COVID cases in ___ than ___ ), putting Teakley County in the first blank and Hawkton County in the second blank indicates a cardinal interpretation, whereas putting Hawkton County in the first blank and Teakley County in the second blank indicates a proportional interpretation.
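The coding logic just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to code the data: the function name code_response, its parameters, and the catch-all 'other' category (for answers matching neither ordering) are hypothetical.

```python
def code_response(first_blank: str, second_blank: str,
                  cardinal_higher: str, proportional_higher: str) -> str:
    """Classify a typed answer to a two-blank comparative such as
    'There are more COVID cases in ___ than ___'.

    cardinal_higher:     county with the higher raw number on the dashboards
    proportional_higher: county with the higher rate or percentage
    """
    answer = (first_blank.strip().lower(), second_blank.strip().lower())
    card = cardinal_higher.strip().lower()
    prop = proportional_higher.strip().lower()
    if answer == (card, prop):
        return "cardinal"       # higher raw number placed in the first blank
    if answer == (prop, card):
        return "proportional"   # higher rate placed in the first blank
    return "other"              # hypothetical category for unclear answers


# Figure 1 item: Teakley County first indicates a cardinal interpretation.
print(code_response("Teakley County", "Hawkton County",
                    cardinal_higher="Teakley County",
                    proportional_higher="Hawkton County"))  # cardinal
print(code_response("Hawkton County", "Teakley County",
                    cardinal_higher="Teakley County",
                    proportional_higher="Hawkton County"))  # proportional
```

Answers falling into the 'other' category are the kind of case where the second textbox, in which participants explained what information they used, could help resolve the response.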
The study tested eight different linguistic frames, four involving COVID cases and four involving vaccination. Four of these function as control conditions, because the lexical semantics of the sentence (e.g. use of phrases such as number of COVID cases vs. rate of COVID cases) signals what degrees the scale ranges over, i.e., provides information about the intended measure function. These are shown in (4) for the cardinal controls and (5) for the proportional controls.
(4) Cardinal control
The number of COVID cases is higher in ___ than in ___
The number of fully vaccinated people is higher in ___ than in ___

(5) Proportional control
The rate of COVID cases is higher in ___ than ___
The vaccination rate is higher in ___ than in ___

I also tested the more ambiguous structures in (6), where there are no lexical items indicating what degrees the scale ranges over. I refer to these as the more + noun conditions. It's worth noting that both cardinal and proportional readings are contextually available; both kinds of numbers are on the dashboards. Thus, we can ask, in the absence of lexical semantic cues, which interpretation do people prefer, and does this differ for COVID cases vs. vaccinated people?
(6) More + noun conditions
There are more COVID cases in ___ than ___
There are more fully vaccinated people in ___ than ___

Finally, two exploratory conditions were also included. Example (7) shows the truncated noun and the truncated adjective conditions. In these conditions, the comparative only mentions more followed by the bare noun COVID or the adjective vaccinated, without mentioning 'cases' or 'people.' Thus, the link between linguistic form and the measure function is even less clear.

(7) Truncated Noun and Adjective conditions:
___ has more COVID than ___ [truncated with noun]
___ is more vaccinated than ___ [truncated with adjective]

Before continuing, it's important to address a potential concern regarding the acceptability of these truncated structures. While they may sound marked to some speakers, our corpus investigations indicate that they occur in natural use, in both formal/official and more informal contexts. Some examples from the internet are provided in (8-9).
(8) a. Canada is more vaccinated than the UK.
b. the teaching staff at Woodstock School District 200 also is more vaccinated than its other workers

(9) a. Lubbock Now Has More COVID Than Colorado's Largest City
b. State of NY has more covid than my town.
c. But either way, America has more Covid than any of the countries he put travel bans on, so how well did they work?
d. "Remember people were making fun of America, that we didn't do it right, we had more COVID than everybody else," he said.
In the truncated conditions (7), the thing being compared (cases or people) is not explicitly mentioned, and furthermore, due to the structure of the sentence, the placename acts as a proxy to refer to its inhabitants (for further discussion of meaning transfer, see e.g. Nunberg 1995). A place does not get COVID or get vaccinated, its inhabitants do. Despite this similarity, the truncated conditions with nouns and adjectives have different properties: The truncated conditions with nouns (__ has more COVID than __) could possibly be construed as covert plurals with an elided plural noun (__ has more COVID cases than __), or perhaps with COVID construed as a mass noun, akin to 'X has more water than Y'. In both cases, there is no explicit reference to individual cases, and thus we might expect the cardinal reading to be relatively less available than in the more + noun conditions in (6).
When we consider the truncated conditions with adjectives (__ is more vaccinated than __), it's worth noting that the comparatives with placenames-standing-in-for-people differ from 'regular' adjectival comparatives where we are comparing individual people (10-11). When we compare two individual people to each other using a truncated adjective comparative (10a), we get a reading where the two people differ in terms of their degree of vaccination (e.g. Kate has only had one shot of a two-shot vaccine and Lisa has had both shots, or Kate has not yet received a booster shot but Lisa has). Unsurprisingly, use of 'fully' is infelicitous in this case (10b).
(10) Adjectival comparatives with names
a. Lisa is more vaccinated than Kate.
b. Lisa is more (*fully) vaccinated than Kate.

(11) Adjectival comparatives with placenames
a. California is more vaccinated than Alaska.
b. California is more (OK fully) vaccinated than Alaska.
However, when we are dealing with two placenames in a truncated adjective comparative (11), and the placenames stand in for their inhabitants (i.e., a plurality), we now instead get cardinal and proportional readings with sentences like (11a) and use of 'fully' is felicitous (since we are no longer comparing degrees of vaccination) (11b). This contrast between (10) and (11) suggests that, semantically, in the truncated adjective comparatives we are comparing two pluralities, which is indeed what we expect if the placename acts as a proxy for referring to the inhabitants of that place. However, this is not explicitly reflected in the sentence (the placenames are morphologically singular) and in fact there is no noun mentioned at all in the truncated adjective comparatives. Thus, intuitively, this condition is perhaps the one where accessing a cardinality scale ranging over numbers (of vaccinated people) might be the 'hardest.' If this line of thinking is on the right track, we expect the cardinal construal to be less likely in the truncated adjective condition than in the more + noun comparatives about vaccination (6), because the latter explicitly mention 'people.' In sum, properties of the (admittedly exploratory) truncated conditions in general lead us to expect a weakening of the cardinality bias, which might be clearest in the truncated adjective condition. More broadly, comparing the four different linguistic frames can provide insights into how linguistic packaging interacts with the predicted cardinality bias and predicate type effects.
2.3. DIFFERENCES BETWEEN THE COVID CONDITIONS AND VACCINATION CONDITIONS. It's important to acknowledge that the sentences comparing COVID cases and vaccinated people differ not only in terms of their stage- vs. individual-level properties, but also in other ways. Some of these are shown in Table 1.

The nominal itself differs between the two types: cases with COVID comparatives and people with vaccine comparatives. I suggest that the noun cases, but not the noun people, is what Krifka (1990) calls a phase noun. Phase nouns differ from non-phase nouns in allowing double-counting. Krifka shows this by contrasting the nouns passengers and persons. Consider a sentence like 'Two million passengers/persons passed through this airport in the last year.' With persons, only an object-related reading is available (i.e., two million different individuals), whereas passengers also allows for an event-related reading (i.e., there were two million events of individuals passing through the airport, but this might only involve one million different people if everyone flew twice). I suggest that cases vs. people shows the same asymmetry. I do not provide a detailed discussion here, but this distinction might facilitate a cardinal measure function with the phase noun cases, as compared to the non-phase noun people.

Another distinction concerns the denominator relevant for the proportional reading. COVID cases, when conceptualized proportionally, are typically reported out of 100,000 (at least in the U.S.), whereas vaccination numbers, when conceptualized proportionally, are typically reported as a percentage of the (relevant) population. These same denominators were used in the studies reported here (as can be seen in the dashboards in Figure 1), in order to make the stimuli maximally naturalistic. However, it could be argued that conceptualizing proportional information in terms of percentages is more natural and 'easier' than conceptualizing it in terms of the arguably less widely used 'per 100,000' metric. Thus, the proportional reading may seem more artificial and thus less accessible with COVID cases per 100,000 than with the percentage of fully vaccinated people.
Ultimately, there are multiple reasons, not just the stage vs. individual level distinction, why comparatives involving COVID cases may be less likely to elicit proportional readings (and conversely, more likely to elicit cardinal readings) than comparatives involving vaccinated people.
Admittedly, comparing two things differing along multiple dimensions is complicated and raises the question of why one would opt for this pairing. I opted to test comparatives about COVID cases vs. vaccinated people for two reasons. First, I wanted to test how people interpret comparatives about information that was highly relevant during the height of the pandemic (which led me to look at infections and vaccination) and that was expressed in a natural/typical way. Second, as a first step, I wanted to see whether we can detect any semantic effects at all: Even if we do not know what precise factor is responsible, can we detect asymmetries in the likelihood of cardinal vs. proportional construals for COVID cases vs. vaccinated people?
In other words, my aim is to see whether we can detect any semantic effects on people's interpretations in contexts where multiple measure functions are in principle available. This information can provide a foundation for future work seeking to systematically disentangle the different ways in which these two domains differ, to see which differences are ultimately responsible for the likelihood of opting for cardinal vs. proportional readings. Thus, this work is best regarded as a first step testing whether this conglomeration of semantic factors matters. If we find evidence that comparatives about COVID cases are interpreted differently than comparatives about vaccination numbers, this discovery would pave the way for a subsequent, more systematic comparison (perhaps in a non-COVID domain) of how factors such as stage vs. individual level, phase nouns vs. non-phase nouns and different denominators contribute.
2.4. PROCEDURE. In Experiment 1, participants saw displays like Figure 1 and typed into the textbox the names of the counties in the order that they should go in the blanks. The study was untimed, i.e. participants were not under time pressure. The dashboards and the critical sentence were shown on the same screen so there was no memory load.
2.5. RESULTS. The results for Experiment 1 are shown in Figure 2. As expected, the control conditions mentioning rate elicit a predominance of proportional interpretations (mostly light grey in the top two bars) and the control conditions mentioning number elicit a predominance of cardinal interpretations (mostly dark grey in the next two bars). Furthermore, we also find a difference between COVID cases vs. vaccinated people: There are more cardinal responses with comparatives about 'COVID cases' than with 'vaccinated people' (p's < 0.05), even in these largely unambiguous control conditions. This provides support for the semantic factors hypothesis.
Turning to the more + noun conditions (fifth and sixth bars from the top), participants mostly provided cardinal responses (mostly dark grey), corroborating the predicted cardinality bias. Furthermore, we again find a semantic effect, with more cardinal responses with 'COVID cases' than 'vaccinated people' (p's < 0.05).
Finally, looking at the truncated conditions (two bottom bars), it's clear that while the truncated noun condition with 'COVID' still exhibits a cardinality bias, such a bias is no longer present in the truncated adjective condition with 'vaccinated,' which elicits more proportional than cardinal responses. This fits with our hunch that this might be the condition where accessing a cardinality scale is hardest, because there is no explicit mention of a plurality. Furthermore, the truncated noun 'COVID' condition (eighth bar) elicits fewer cardinal responses than the more + COVID conditions (sixth bar), even though both are about COVID. In other words, the cardinal bias we saw in the more + COVID conditions is weakened in the truncated versions where no explicit reference is made to cases.

2.6. DISCUSSION. Overall, these results provide clear support for the hypothesized cardinality bias and for the semantic factors hypothesis, which posits that the strength of the cardinality bias can be modulated by factors related to the target of comparison, with comparisons involving COVID cases eliciting more cardinal interpretations than comparisons involving vaccinated people. Furthermore, the results show that linguistic form also plays a role. In the more + noun conditions, which lack explicit lexical semantic information signaling the measure function, participants are less likely to opt for proportional readings than in the proportional control conditions that explicitly signal, by means of the word rate, that we are dealing with a measure function ranging over proportions. In addition, when no explicit reference is made to cases or to people, in the exploratory truncated noun and adjective conditions, the cardinality bias is weakened even more. This shows that linguistic packaging plays a key role in modulating the ease of accessing, or perhaps the salience of, proportional measure functions relative to default cardinal measure functions.
3. Experiment 2: Superlatives. The second experiment tests whether the effects observed in Experiment 1 extend to superlatives, arguably a cognitively more complex situation because three different counties are being compared. Based on semantic theory, we expect the results from Experiment 1 to replicate.
3.1. PARTICIPANTS. Data for 129 adult native U.S. English speakers, recruited via Amazon MTurk, is reported for Experiment 2. None had participated in Experiment 1. Again, participation took place remotely over the internet.
3.2. DESIGN AND MATERIALS. The design and materials were the same as for Experiment 1, except that now participants saw three dashboards on each screen and filled in single blanks in superlative sentences of the forms shown in (12-15). The different sentence types (cardinal control, proportional control, most + noun, and truncated noun and truncated adjective) were kept as similar to Experiment 1 as possible, but adapted into superlatives.
(12) Cardinal control
___ has the highest number of COVID cases
___ has the highest number of fully vaccinated people

As in Experiment 1, some readers may have questions about the naturalness of the truncated forms. Based on our corpus investigations, they occur in naturalistic texts. Examples from the internet are in (16-17).
(16) a. The figure below illustrates that among Midwestern counties the well "wired" urban centers have the most COVID-19, with "unwired" rural hinterlands and smaller communities relatively unscathed.
b. The richest and one of the most modern countries in the world has the most covid and it is increasing exponentially
c. Also, isn't it coincidental that the most tested populations have the most COVID, and those unvaccinated untesting countries have almost no cases?

(17) a. Out of all the zip codes in Douglas County, 68007, is the most vaccinated.
b. She said the older population is the most vaccinated.

3.3. PROCEDURE. The procedure was the same as for Experiment 1, except that participants now saw three dashboards per screen and saw superlative sentences with only one blank (12-15).
3.4. PREDICTIONS. The predictions are the same as for Experiment 1, assuming that the effects extend to a more cognitively complex context that requires comparing three elements.
3.5. RESULTS AND DISCUSSION. The Experiment 2 results are in Figure 3, which shows that the key results from Experiment 1 replicate with superlatives. (Note, though, that the rate of cardinal responses in the proportional controls mentioning COVID cases, in the second bar from the top, is essentially at chance in Experiment 2; importantly, this is still much lower than the rate of cardinal responses in the cardinal controls and thus largely fits with what we've seen so far.)

4. General discussion. This paper reports two psycholinguistic studies on quantity comparatives and superlatives that are potentially ambiguous between cardinal and proportional readings, with the aim of improving our understanding of what happens in linguistic environments where multiple measure functions are available. The studies test how likely participants are to opt for a cardinal or a proportional measure function, and what factors modulate these preferences. The COVID-19 pandemic provides a naturalistic context for exploring these kinds of questions, because the cardinal vs. proportional distinction is very important when considering information about COVID cases and vaccination rates. For example, if someone says that small town A with fewer than ten thousand inhabitants has more COVID than big city B with a population of over a million, it's very important to know whether we are dealing with a cardinal measure function (it would be a very bad sign for small town A to have a higher absolute number of COVID cases than big city B) or a proportional measure function (proportionally, it would not be inconceivable for small town A to have a relatively higher rate of COVID cases than big city B). Moreover, comparatives and superlatives about COVID cases and about vaccinated people differ semantically in several theoretically interesting ways, allowing us to start to investigate how much these properties guide people's interpretations, when coupled with different linguistic forms.
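The small-town vs. big-city scenario can be made concrete with a short sketch of the two measure functions. This is an illustration under stated assumptions rather than a formal analysis: the function names, the Place tuple, and the population and case figures are all hypothetical, chosen only to match the description (a town of under ten thousand, a city of over a million).

```python
from typing import Callable, NamedTuple


class Place(NamedTuple):
    name: str
    population: int  # hypothetical figure
    cases: int       # hypothetical figure


def cardinal_measure(p: Place) -> float:
    """Degree of cardinality: the raw count, ignoring the denominator."""
    return float(p.cases)


def proportional_measure(p: Place, per: int = 100_000) -> float:
    """Degree of proportion: cases relative to population, per 100,000."""
    return p.cases * per / p.population


def has_more_covid(a: Place, b: Place,
                   measure: Callable[[Place], float]) -> bool:
    """Truth value of 'a has more COVID than b' under a given measure."""
    return measure(a) > measure(b)


town_a = Place("small town A", population=9_000, cases=90)
city_b = Place("big city B", population=1_200_000, cases=3_000)

# Same sentence, different truth values depending on the measure function.
print(has_more_covid(town_a, city_b, cardinal_measure))      # False
print(has_more_covid(town_a, city_b, proportional_measure))  # True
```

With these figures, town A has 1,000 cases per 100,000 and city B has 250 per 100,000, so the sentence is false on the cardinal construal and true on the proportional one.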
The results provide new evidence that, in certain contexts, comparatives and superlatives can refer to scales ranging over degrees of proportion, in addition to degrees of cardinality. Furthermore, the experimental data points to a cardinality bias (a preference for cardinal interpretations), but also shows that this bias can be weakened in favor of proportional readings by semantic factors and by certain linguistic forms. More specifically, overall, there are more cardinal interpretations with comparatives and superlatives about COVID cases than with those about vaccinated people, which may be related to the former being a stage-level property and the latter being an individual-level property. These patterns are also modulated by lexical semantics (e.g. words such as rate that refer to proportions), and by whether the sentence includes an explicit mention of cases/people, which seems to make cardinal readings more available.
Many questions remain open for future work. In particular, a more systematic disentangling of different semantic factors is necessary in order to understand which aspects of the differences between statements about COVID cases and statements about vaccinated people are responsible for the asymmetries revealed by the experiments reported here. The current work contributes a necessary foundation by providing novel evidence that these two kinds of statements clearly do differ in terms of whether they tend to receive cardinal or proportional readings, and suggests some possible explanations for this, but future work is needed to tease apart which factor(s) are driving these effects.