Real-time processing of indexical and generic expressions: Insights from, and implications for, COVID-related public health messages

We used COVID-related health messages to investigate the incremental processing of expressions such as you, we and people in generic contexts, to further our understanding of how these kinds of expressions are processed in real-time and to explore whether the ease of comprehending public health messages related to the COVID pandemic is influenced by type of referring expression. Results from a self-paced reading study point to an increased processing load in messages with the non-indexical form people (relative to we and you), which we suggest is separable from effects of word length and frequency. We interpret this as preliminary support for the Indexicality Hypothesis, which posits that expressions which can in principle receive indexical interpretations are easier to process than non-indexicals, and also emphasize the need for further work on these kinds of questions.

generic uses of people - is both empirically and theoretically meaningful, given that first- and second-person pronouns are some of the most frequent and communicatively central parts of language, while also being semantically fundamentally different from anaphoric pronouns (e.g. Braun 2001).
We use COVID-related health messages to take initial steps to explore the processing of referential expressions including you, we and people, with the dual aims of (i) contributing to our understanding of how these experimentally under-researched expressions are processed in real time and (ii) exploring whether the ease of comprehending public health messages related to the COVID pandemic (as measured by reading time) is influenced by type of referring expression. We emphasize that this work is best viewed as an initial, preliminary exploration of these kinds of issues, and that many questions remain open. We hope that some of the questions raised by our work can inform future investigations.
In the present work, we focus on generic uses of you, we and the impersonal expression people; in other words, we focus on contexts where these forms are not used to refer to specific individuals but rather to people in general. All of these forms are regularly used in COVID-19 related public health announcements and advice, as exemplified in (1a-c).
(1) a. You should wear a mask, even if you do not feel sick. (source: https://www.cdc.gov/coronavirus/2019-ncov)
b. We should wear masks anywhere where we are unsure of the vaccination status of the people around us. (source: https://sph.umd.edu/news/mask-or-not-mask, UMD School of Public Health)
c. People should wear masks that cover both the mouth and nose… (source: https://stacks.cdc.gov)
Before continuing, a brief terminological clarification is in order. In the linguistic literature, the term 'generic' is often used for non-indexical uses of you and we that mean (roughly) 'people in general,' and can include the speaker and the addressee, as in (1a-b). In this paper, we also use the term 'generic' for the impersonal form people in contexts like the one in (1c). By calling these three forms generic, we do not mean to imply that all three expressions are semantically or pragmatically equivalent. Rather, we use the term 'generic' to distinguish uses like those in (1) from clearly indexical uses such as I'm glad you wore your mask yesterday and We bought a ten-pack of masks this morning that clearly refer to specific individuals, not people in general.
COVID messages with deontic modality like the examples in (1) have the advantage of allowing for minimal triplets where the communicative function of we/you/people is as similar as possible: in public health communication messages like those in (1), all three forms are likely to be construed non-indexically (i.e., you and we are likely to receive generic construals). In all three cases, the speaker or writer makes the same basic point, namely that according to current rules and/or expectations, masks are to be worn.
This uniformity is not the case with you, we and people across the board. For example, compare We go to Italy in the summer vs. People go to Italy in the summer. The first sentence involves indexical reference, while the second involves generic reference (people means 'people in general').
Thus, public health messages like those in (1) are especially well-suited for comparing the online processing of these three forms, while keeping their communicative contribution as similar as possible, by favoring a non-indexical, generic construal in all cases. Moreover, an understanding of how different referential forms modulate the comprehension ease of health messages can also have broader implications for public health communication.

Prior work.
To the best of our knowledge, earlier work on first and second person pronouns in COVID messages (and health messages more generally) is fairly limited, and yields a divergent set of results. In recent work, Tu et al. (2021) tested COVID stay-at-home appeals with either you or we, such as "stay home, you can get through this / we can get through this together." They found that messages with second-person you are more effective than ones with first-person plural we when it comes to shaping people's self-reported likelihood of staying at home vs. going to a friend's party in a hypothetical scenario. Kaiser's (2021) work on COVID messages about masks and social distancing, which tested you, we and people/everyone in subject position, found that messages that used expressions like people/everyone were rated more convincing by Democrats, while non-Democrats showed no clear sensitivity to the type of referring expression.
In non-health-related work, Brunyé et al. (2009) tested short narratives with second-person you, first-person I and third-person pronouns in non-generic, episodic contexts (e.g. using sentences like you are slicing tomatoes, coupled with a picture of hands cutting a tomato). While they found evidence that second-person you can push participants to assume an agentive participant perspective, the results for first-person I were more context-dependent. I elicited an actor perspective in one-sentence contexts, but in longer narrative contexts, I elicited an onlooker perspective. (We do not test first-person I in this paper.) As a whole, these kinds of results suggest that comprehenders are indeed sensitive to distinctions between different types of pronominal expressions, but, partly due to the different research aims of different researchers, existing results do not yet yield a unified picture of how different forms pattern. Further systematic investigation is needed.
Furthermore, these kinds of studies typically use offline rating scales that ask for explicit ratings of persuasiveness (e.g. Kaiser 2021) or the likelihood of different hypothetical or future actions (e.g. Tu et al. 2021). These kinds of studies do not provide a measure of incremental, real-time processing and do not provide insights into whether certain linguistic options are harder to process than others. Given the value of easily-understood health messages (see e.g. the CDC's Health Communication Playbook), identifying differences in processing ease of different expressions has both theoretical and applied relevance. In order to gain insights into processing load, the present work uses a self-paced reading paradigm.

Experiment.
To explore the real-time processing of generic uses of you, we and people in COVID-related health messages, we used self-paced reading to test three hypotheses, rooted in prior work in linguistics and psychology, regarding the processing ease of these forms. We preview those hypotheses here (see Section 5 for details): According to the Indexicality Hypothesis, linguistic expressions that could, in principle, be indexical (i.e., refer directly to individuals salient in the speech situation; speaker and/or addressee, e.g. you, we) are easier to process than non-indexical forms like people, even in contexts where the forms are used non-indexically.
In contrast, the Perspective-taking Hypothesis predicts that the perspective-sensitive nature of potentially indexical forms (the fact that, when used indexically, their referents can switch depending on whose perspective is being assumed) incurs a processing cost, relative to consistently non-perspectival expressions like people, which are thus correspondingly easier to process.
Finally, the Genericity Hypothesis posits that the genre/context of public health messages (the fact that they favor generic construals across the board) renders the directly addressee- and speaker-referring indexical meanings inaccessible, thereby making you and we equivalent to people and leading us to expect no processing differences.

Method
4.1. PARTICIPANTS. Participants, recruited via Prolific, did the study over the internet. A total of 58 participants took part; 10 were excluded from subsequent analyses due to not being native English speakers (6 people), failing a standard attention-check question (2 people) or reporting vision/hearing impairments (2 people). No participants were excluded for poor performance on comprehension questions as all participants met our pre-determined threshold. Thus, 48 native English-speaking, U.S.-based participants were included in the final analysis.
4.2. DESIGN. We manipulated referring expression type (we, you, people) to create three conditions as exemplified in (2a-c).1 The referring expression was preceded by a short preamble phrase (e.g. on account of the pandemic) and followed by an auxiliary verb, the main verb and the rest of the sentence. The study consisted of 39 targets, with different COVID-19 related recommendations, presented in a within-subjects Latin-Square design.2
(2) a. On account of the pandemic, you should get the vaccine to prevent further spread of COVID-19, even if you are young and healthy.
b. On account of the pandemic, we should get the vaccine to prevent further spread of COVID-19, even if we are young and healthy.
c. On account of the pandemic, people should get the vaccine to prevent further spread of COVID-19, even if they are young and healthy.
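The Latin-Square assignment of the 39 items to the three conditions can be sketched as follows. This is a minimal illustration, not the authors' materials script: it assumes three presentation lists built by rotating the conditions across items, and the function and variable names are hypothetical.

```python
CONDITIONS = ("you", "we", "people")
N_ITEMS = 39  # number of target items reported in the text


def latin_square_lists(n_items=N_ITEMS, conditions=CONDITIONS):
    """Build one presentation list per rotation of the conditions.

    Assumption (not stated in the text): there are as many lists as
    conditions, and each participant sees exactly one list.
    """
    k = len(conditions)
    lists = []
    for list_index in range(k):
        # Item i is shown in condition (i + list_index) mod k, so every item
        # appears in a different condition on each list.
        lists.append([conditions[(item + list_index) % k]
                      for item in range(n_items)])
    return lists


lists = latin_square_lists()
for number, assignment in enumerate(lists, start=1):
    counts = {c: assignment.count(c) for c in CONDITIONS}
    print(f"list {number}: {counts}")
```

Because 39 is divisible by 3, each list contains 13 items per condition, and across the three lists every item occurs once in each condition.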
In U.S. English, the pronoun you is (in principle) ambiguous between an indexical and a generic interpretation: On an indexical interpretation, it refers to the addressee (e.g. Brunyé et al. 2009 on you triggering a participant perspective). On a generic interpretation, you refers to people in general (like 'one'). We is also potentially ambiguous between indexical and generic readings (e.g. Whitley 1978, Kamio 2001, Holmberg & Phimsawat 2017). This is illustrated in (3a-b). Kamio (2001) notes that (3a) can be construed generically, with we referring to the 'whole human race.' For (3b), Whitley (1978) notes that we clearly does not refer to the speaker or the addressee, as they will not survive into the twenty-third century. (This generic use of we seems less frequent than the generic use of you, however.)3 Ultimately, in the context of COVID recommendations of the type in (2), the generic interpretations of you and we seem very natural, even though the forms themselves can be used with either indexical or generic construals in the right contexts. The relevant contrast for our purposes is that, in contrast to you and we, the bare noun people is not an indexical expression and can only receive a people-in-general-type interpretation.
(3) a. We all get older day by day. (Kamio 2001)
b. Just think! In the twenty-third century we will teleport to Mars in just seconds. (Whitley 1978)
1 Two items out of 39 used everyone instead of people, as it was judged to sound more natural for those two items.
2 Some items referred back again to the subject of the sentence, as (2) does in the final clause. However, here we only analyze the reading times for the first occurrence of you, we or people.
3 There exists an extensive and nuanced theoretical literature on various generic and impersonal forms in different languages, as well as their semantic and pragmatic properties. The brief discussion here does not do justice to the extensive research on this topic. In this paper, what is most relevant for us is the 'coarse-grained' observation that while all three forms can receive generic readings, only we and you can, in other contexts, be semantically indexical (refer to speaker and/or addressee); the expression people is not an indexical form.
4.3. PROCEDURE. Participation took place remotely over the internet in Spring 2021. The sentences were presented using a word-by-word moving-window self-paced reading paradigm, using PCIbex (Zehr & Schwarz 2018). Participants pressed the space bar to advance through the sentences, and the reading time for each word was measured. Each sentence was followed by a multiple-choice comprehension question to ensure participants paid attention to the task.
4.4. DATA ANALYSIS. To address concerns regarding inattentiveness of web-based participants, reading times below 100ms and over 3000ms were removed (<1% of data), as were trials with consecutive RTs below 100ms (indicating that participants were simply holding down the space bar / pressing repeatedly without reading, 6.25% of trials). Mixed-effect regression models were used to analyze log-transformed RTs at each word position. Models were fitted using the lme4 package (Bates et al. 2015) and lmerTest (Kuznetsova et al. 2017) in the R software environment (R Development Core Team, 2019). We also calculated residual logRTs in order to control for differences in word length. Word length (in characters), word position in the sentence, and item position in the list (as well as participant ID) were included when calculating the residual logRTs. Residual reading times were calculated based on all trials in the experiment.
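The trimming and residualization steps described above can be sketched as follows. The analyses reported here were run in R; this Python sketch is only an illustration under stated assumptions. It reads "consecutive RTs below 100ms" as two or more adjacent words, and it residualizes log RT on word length alone, omitting word position, item position and participant ID. All names and the example data are hypothetical.

```python
import math

LOW_MS, HIGH_MS = 100, 3000  # cutoffs reported in the text


def clean_trial(rts):
    """Apply the two exclusion rules to one trial's word-by-word RTs (ms).

    Returns None if the whole trial is dropped; otherwise a list in which
    out-of-range RTs are replaced by None.
    """
    # Assumption: "consecutive" means at least two adjacent words under 100 ms.
    if any(a < LOW_MS and b < LOW_MS for a, b in zip(rts, rts[1:])):
        return None
    return [rt if LOW_MS <= rt <= HIGH_MS else None for rt in rts]


def residual_log_rts(rts, word_lengths):
    """Residuals of log RT after a simple regression on word length.

    Simplification: the residualization described in the text also used word
    position, item position and participant ID, which are omitted here.
    """
    kept = [(n, math.log(rt)) for rt, n in zip(rts, word_lengths)
            if rt is not None]
    mean_x = sum(x for x, _ in kept) / len(kept)
    mean_y = sum(y for _, y in kept) / len(kept)
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    slope = sxy / sxx if sxx else 0.0  # guard against identical word lengths
    intercept = mean_y - slope * mean_x
    return [None if rt is None else math.log(rt) - (intercept + slope * n)
            for rt, n in zip(rts, word_lengths)]


# Made-up RTs (ms) and word lengths, for illustration only.
trial = clean_trial([350, 420, 90, 510, 3200, 380])
lengths = [2, 7, 3, 6, 4, 5]
print(trial)                             # out-of-range values become None
print(residual_log_rts(trial, lengths))
print(clean_trial([300, 80, 60, 400]))   # whole trial dropped
```

A positive residual means a word was read more slowly than its length alone would predict, which is why residual logRTs allow the longer word people to be compared with you and we.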
The referential forms were compared pairwise. As random effects, we included intercepts for subjects and items, and by-subject and by-item random slopes for referential form type when justified by model comparison: Random effects started out fully specified with by-subject and by-item effects of referential form, and were reduced by model comparison. Only random effects that contributed significantly to the model (p<.05) were maintained (Baayen et al. 2008).
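The model-comparison step for random effects can be sketched as follows. This is a sketch under an assumption the text does not state, namely that the comparison is a likelihood-ratio test between a model with and a model without the random effect in question. The log-likelihood values are invented for illustration and the function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values for df = 1 and df = 2.
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    # Recurrence on the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1.0
    return q


def keep_random_effect(loglik_full, loglik_reduced, df_diff, alpha=0.05):
    """Return (keep, p): keep the random effect if the LRT gives p < alpha."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    p = chi2_sf(statistic, df_diff)
    return p < alpha, p


# Hypothetical log-likelihoods: dropping a random slope removes two
# parameters here (one variance and one covariance).
print(keep_random_effect(-1200.0, -1204.5, df_diff=2))
print(keep_random_effect(-1200.0, -1201.0, df_diff=2))
```

In the first call the slope improves the fit enough to be retained; in the second it does not, so the simpler random-effects structure would be used.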

Hypotheses.
We explored three hypotheses concerning the ease of processing the forms we tested (you, we, people); these are outlined below. While these hypotheses make predictions about the processing ease of sentences containing the different forms, they do not, in their current form, specify when exactly these effects emerge. Indeed, it is well-known that self-paced reading often yields spillover effects, i.e., situations where the expected effects do not surface on the target word but instead show up on subsequent words.4 The lack of immediate effects seems especially likely in a design like ours where the sentences (as in (2)) do not involve disambiguation, garden-pathing or other unexpected input. Thus, we included a spillover region in our analysis and looked for differences not only on the critical word but also on following words.

INDEXICALITY HYPOTHESIS.
Indexically-interpreted pronouns refer to highly salient referents, and do not require a comprehender to evoke or construct a new discourse referent or even a generic operator or referent. For example, when you is used indexically, its referent (the addressee) is automatically available thanks to the context of the speech situation.5 However, to interpret the expression people, some kind of additional representation presumably needs to be evoked, and this representation does not 'come for free' as part of the speech situation, unlike the speaker and addressee referents of indexicals. The special status of indexicals receives initial indirect support from the results of Warren & Gibson (2002, 2005) on the processing of indexicals in a different context. According to the Indexicality Hypothesis, this special property of indexicals is hard-wired into the representation of these forms (i.e., it is relevant even in contexts where they are not functioning as indexicals, such as the generic contexts we tested), which leads us to expect any expressions that can in principle have an indexical interpretation to be easier to process than forms that can never be indexical. This predicts that sentences with you and we would be processed faster than sentences with people (you, we < people).
4 For example, in a pronoun study on German also conducted via Prolific and self-paced reading, Hinterwimmer & Brocher (2018) typically found the clearest effects on the second and third words after the critical pronoun.
5 The observation that indexical pronouns like you and I refer to highly-salient referents is also reflected in the fact that languages with pro-drop, i.e., languages that allow (typically) subject pronouns to be omitted, allow first- and second-person pro-drop much more freely, in a less constrained fashion, than third-person pro-drop. Intuitively speaking, in all three cases, the referent needs to be sufficiently prominent for pro-drop to be licensed, and first- and second-person referents (speakers and addressees) are essentially always highly salient, in contrast to third-person referents mentioned in prior discourse that may no longer be very salient. In languages like English that do not have pro-drop, first- and second-person pronouns are still expected to be very easily interpreted, because first- and second-person referents are so salient and highly accessible.

PERSPECTIVE-TAKING HYPOTHESIS.
Although indexicals refer to salient referents, they are also perspective-sensitive: The referent of indexical we or you depends on who the speaker and the addressee are. Once the speaker changes, for example, the referent of we, when uttered by the speaker, also changes. According to the Perspective-Taking Hypothesis, if this perspective-sensitivity is hard-wired into our processing of certain pronouns (regardless of context), then, in light of work suggesting that perspectival processing, broadly construed, is cognitively costly (e.g. Keysar et al. 2000, Ferguson et al. 2017), you and we may be harder to process (elicit longer reading times) than people, which has no indexical component (people < you, we). This predicts essentially the opposite RT pattern to what the Indexicality Hypothesis predicts.

GENERICITY HYPOTHESIS.
As illustrated in (2), all three forms can receive generic interpretations in contexts like the COVID recommendations that we tested in this study. In other words, in the kinds of sentences we looked at, it is possible, and in fact very likely, that comprehenders construe you and we (and of course people) as not referring to any specific, identifiable individual (or set of individuals), but rather simply as referring broadly to 'people in general.' According to the Genericity Hypothesis, the preferred status/availability of this 'people in general' interpretation renders the indexical construals of we and you irrelevant or unavailable in this context. Thus, this hypothesis predicts no differences between the three forms, and they may all be equally easy (or hard) to process (you = we = people).

Indexicality Hypothesis          you, we < people
Perspective-taking Hypothesis    people < you, we
Genericity Hypothesis            you = we = people
Table 1. Predictions of the three hypotheses regarding ease of processing advice-giving sentences with you, we and people (left-to-right ordering reflects easier vs. harder to process)

6. Results. Figure 1 shows raw reading times (RTs) in ms; statistical analyses were conducted on log-transformed RTs and on residual logRTs to control for differences in word length. As we will see, overall, we find that sentences with people elicit RT slowdowns relative to sentences with you or we, which pattern alike. This pattern appears to hold even when word length is taken into account. We suggest that these results provide preliminary support for the Indexicality Hypothesis.
Let's take a closer look at the reading time patterns. In the log-transformed RT analyses, there are no effects of referential form before the critical region, and no effects at the critical region itself (you/we/people, "0" in Figure 1). The residual logRT analyses suggest that, at the critical region, once the consequences of word length on reading time are factored out, the people conditions are read relatively faster than the we conditions (t=1.975, p<0.05). However, the difference between people and you does not reach significance in the residual logRT analysis (it is marginal: t=1.758, p=0.079). A people/we difference in the absence of a reliable people/you difference is hard to interpret and is not fully predicted by any of the hypotheses being tested. 6 It's worth noting that, if we adopt |t| ≥ 2 as the criterion for significance instead of using lmerTest, the people/we difference is also not significant, but see Luke (2017) for discussion.
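The relation between the two significance criteria mentioned above can be illustrated with a small sketch. It assumes a large-sample normal approximation to the t statistic, which is not how lmerTest derives p-values (by default it uses Satterthwaite-approximated degrees of freedom), so the values are only approximate. The function name is hypothetical; the t-values are the ones reported for the critical region.

```python
import math


def two_sided_p_normal(t):
    """Two-sided p-value for t under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))


# t-values reported for the critical region (residual logRT analysis).
reported = {"people vs. we": 1.975, "people vs. you": 1.758}

for contrast, t in reported.items():
    p = two_sided_p_normal(t)
    print(f"{contrast}: t={t}, approx. p={p:.3f}, |t| >= 2: {abs(t) >= 2}")
```

Under this approximation the people/we contrast falls just below .05 while failing the |t| ≥ 2 criterion, and the people/you contrast comes out at roughly .079, in line with the reported marginal effect.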
At spillover region 2, there are no hints of any significant effects in either analysis (p's>0.15).

Figure 1. Raw reading times (in ms). Error bars show +/-1 SE
In contrast, at spillover region 3, we see clear differences emerging. Now, the people conditions are significantly slower than the you conditions (logRTs: t=-2.85, p<0.005; residual logRTs: t=-2.708, p<.01) and also significantly slower than the we conditions (logRTs: t=-2.84, p<.005; residual logRTs: t=-2.73, p<.01). Thus, these differences persist even when we control for word length differences by analyzing residual logRTs. (Here, |t| ≥ 2, so these effects survive even if we follow that criterion for assessing significance.) We suggest that it is unlikely that this slowdown at spillover region 3 in the people conditions is simply due to lexical frequency differences between the less-frequent people and the relatively more frequent forms you and we. This is because the preceding regions show no significant differences that could be plausibly construed as slowdowns caused by low lexical frequency. Furthermore, at spillover region 3 we are already three words downstream from the critical word (region 0). Especially when coupled with the patterns in the preceding regions and the large slowdown observed in region 3, this seems to argue against pure lexical frequency driving the slowdown at this point. Rather, we suggest that the finding that sentences with people are read more slowly than sentences with you or we could be taken as preliminary evidence for the Indexicality Hypothesis. Admittedly, more work is needed to further assess this idea.
At spillover region 4, there is still a marginal slowdown in the people conditions relative to the you conditions in the logRT analyses (logRTs: t=-1.692, p=0.09; residual logRTs: t=-1.544, p=0.12) but no other differences. Spillover region 5 shows no significant differences in either analysis.
Overall, then, at spillover region 3 we find that conditions with the subject people elicit longer reading times than conditions with we or you in subject position.
7. General discussion. This study is a preliminary foray into the real-time processing of non-anaphoric pronouns and generic expressions, and focuses on the use of you, we and people in COVID-19 health messages. To the best of our knowledge, this is the first COVID-related self-paced reading study to test how different forms (you, we, people) impact reading time, which we take to reflect ease of processing. Our results provide preliminary evidence for an increased processing load in public health messages with the non-indexical form people (relative to pronouns we and you) which we interpret as providing initial support for the Indexicality Hypothesis: The preliminary finding that health messages with people are processed more slowly than ones with we or you (even when we control for differences in word length) is compatible with the core idea of the Indexicality Hypothesis, which is that indexical linguistic expressions have some aspect of this special status 'hard-wired' into their meaning (so that it persists even in contexts where the forms are used generically/non-indexically) and can thus be processed more rapidly than expressions like people that are never indexical. To interpret people, some kind of additional representation needs to be evoked, and this representation does not 'come for free' as part of the speech situation, unlike the speaker and addressee referents of indexicals. The details of this line of reasoning still need to be worked out.
It is worth noting that you and we mostly pattern alike in our data, suggesting that they are equally easy to process in the kinds of contexts we tested. In light of their shared indexical 'potential,' this may not be surprising, and indeed it is predicted by the Indexicality Hypothesis. Open questions remain about how these results relate to experiments that did not collect processing data, including Tu et al.'s (2021) findings about you being better at influencing people's (hypothetical) behavior than we.
More broadly, we emphasize that these issues, including the hypotheses we proposed in this paper, need further investigation but can be challenging to test due to the intrinsic lexical differences between the conditions. Also, it's worth noting that, no matter what ultimately turns out to be the explanation for the differences in reading times, the fact that there do appear to be differences in reading times (which point to differences in ease of processing) suggests that this kind of research can have implications for the construction of easily-understood public health messages.
One important future direction concerns investigating whether individual differences in people's general attitudes about COVID as well as their individual attitudes about specific kinds of mitigation behaviors (masking, vaccination, etc.) influence reading times. Furthermore, independent of the COVID-related questions, experimental work directly comparing the real-time processing of indexical vs. generic uses of you and we would be very informative; in the present study, we intentionally used a context that strongly favored generic interpretations and thus our results do not speak directly to how these forms are processed when they are unambiguously indexical. Finally, in light of crosslinguistic variation in how different kinds of generic reference are expressed (see e.g. Siewierska 2008, see also Kaiser 2015 on Finnish vs. English), crosslinguistic experimental work would also be very welcome.
As a whole, the results reported in this paper can help further our understanding of how different non-anaphoric pronominal expressions (often neglected in prior models of pronominal processing) are processed in real time, and this kind of research potentially has practical implications for the construction of easily-understood public health messages.