Graded complementarity in the resolution of pronouns and reflexives

This paper presents an experimental evaluation of how pronoun and reflexive resolution preferences in English vary across different syntactic environments. Five structures were tested: coarguments, picture noun phrases, prepositional phrases, coordination, and comparatives. Results show that reflexives display a general preference for structurally local antecedents, but the strength of the preference varies significantly by environment; pronouns display a similarly variable but stronger preference for nonlocal antecedents. Our findings suggest that complementarity between pronouns and reflexives may be a gradient phenomenon, with the robust complementarity observed in coargument anaphora occupying the endpoint of a graded continuum.

Coordination represents a case of mixed complementarity: at least according to the judgments in the literature, local and nonlocal antecedents are available to coordinated reflexives (4-a), but coordinated pronouns still require nonlocal antecedents (4-b). Similarly, while reflexives embedded in comparative clauses can be resolved either to local or nonlocal antecedents (5-a), pronouns must be resolved nonlocally (5-b). The influence of semantic factors is particularly stark in comparatives: although we take local binding to be syntactically available in (5-a), the semantics of the comparative construction strongly disfavor this interpretation, since it is impossible to be taller than yourself.
(4) a. Gladys_i said that Ethel_j praised both Astrid and herself_i/j.
    b. Gladys_i said that Ethel_j praised both Astrid and her_i/*j.
(5) a. Gladys_i said that Ethel_j was taller than herself_i/j.
    b. Gladys_i said that Ethel_j was taller than her_i/*j.
The descriptive generalization characterizing the strength of complementarity in English appears to be that complementarity is strongest when an anaphor and its local antecedent are coarguments of the same predicate (see, e.g., Reinhart & Reuland 1993), as in (1), but not (2)-(5). Environments in which the antecedents available to reflexives and pronouns are not complementary thus pose a challenge to classic binding theory, and have drawn interest for what they may reveal about the scope of the binding conditions and the grammar of anaphora (Zribi-Hertz 1989; Hestvik 1991; Pollard & Sag 1992; Reinhart & Reuland 1993; Safir 2004; Büring 2005; Reuland 2011; Charnavel & Sportiche 2016; Charnavel & Bryant 2022).
1.2. RESOLUTION PREFERENCES IN NONCOMPLEMENTARY ENVIRONMENTS. A large body of work in the sentence processing literature has investigated the degree to which the empirical generalizations described by binding theory are also reflected in processing (Nicol & Swinney 1989; Clifton et al. 1997; Badecker & Straub 2002; Sturt 2003; Kennison 2003; Kazanina et al. 2007; Xiang et al. 2009; Chen et al. 2012; Dillon et al. 2013; Chow et al. 2014; Patil et al. 2016; Parker & Phillips 2017; Sloggett 2017; Kush & Dillon 2021). Although there is substantial evidence that structural information influences online and offline resolution, the majority of studies have focused on coargument contexts. The smaller number of experimental studies on anaphor resolution in noncomplementary environments (Keller & Asudeh 2001; Runner et al. 2006; Kaiser et al. 2009; Cunnings & Sturt 2014, 2018; Bryant 2022) have found that preferences are not entirely unrestricted in these structures, although complementarity is weaker.

Cunnings & Sturt (2014) conducted a series of eye-tracking-while-reading experiments comparing the resolution of reflexives in embedded coargument, PNP, and possessed PNP constructions, finding that participants preferred to resolve reflexives to the local antecedent in all constructions, with the strongest effect in the coargument condition. Cunnings & Sturt (2018) conducted a follow-up study with pronouns in the same constructions and found roughly the reverse effect: participants defaulted to nonlocal resolution for all pronouns, but the preference was strongest with coargument pronouns. Cunnings & Sturt conclude that structural information constitutes a highly weighted resolution cue, but the strength of the cue may vary by construction.

Bryant (2022) conducted an offline sentence rating study on pronouns and reflexives in locative PPs, finding that participants rated reflexives highest in sentences with motion verbs and direct contact between figure and ground (e.g., Chloe poured some glitter on herself), while the opposite pattern was found with pronouns; preferences were more graded compared to coargument controls. Bryant suggests that preferences are guided by event structure, with the preference for a reflexive stronger when the event structure involves self-directed action and more closely resembles a "reflexive event" of the sort prototypically expressed by coargument structures. Conversely, the less reflexive the event, the stronger the preference for a pronoun over a reflexive.

1.3. THE CURRENT STUDY. In this paper we ask how comprehenders' offline resolution preferences vary across different noncomplementary environments. It is possible that there is a sharp coargument/non-coargument divide, with preferences unrestricted outside coargument environments. Alternatively, preferences could vary more finely across syntactic structures. We compare the extent to which comprehenders prefer local antecedents for reflexives and nonlocal antecedents for pronouns across five constructions: coarguments, PNPs, PPs, coordination, and comparatives. In contrast to prior studies testing resolution across constructions (e.g., Cunnings & Sturt 2014, 2018), we are interested in comparing how the same participants rate different constructions. This will address the question of whether noncomplementary environments vary and to what extent.
In Experiment 1, we test reflexive resolution preferences using an antecedent choice task that asks participants to choose between a local and a nonlocal antecedent for a reflexive embedded in one of the five constructions of interest. In Experiment 2, we test pronoun resolution preferences using the same task. To preview our results, we find evidence for general locality-based preferences outside coargument contexts: reflexives generally prefer local antecedents, and pronouns prefer nonlocal ones. However, these preferences are graded across structures and, especially in the case of reflexives, vary significantly. These results align with prior experimental literature in providing evidence for weaker complementarity in non-coargument structures.
2. Experiment 1: Reflexives. We conducted an antecedent choice task in which participants were asked to identify the antecedent for a reflexive across five different syntactic environments: coarguments, PNPs, PPs, coordination, and comparatives.
2.1. METHODS. 90 native speakers of American English participated in an online experiment on the PCIbex platform (Zehr & Schwarz 2018). Participants were screened with a demographic survey and attention checks. Filler items that included animacy violations were used as attention checks; participants were excluded who marked 50% of such items or fewer as an 'Unnatural' English sentence. In addition, we excluded self-reported non-native speakers of American English. After excluding participants based on these procedures, 60 participants were included in the analyses reported below.
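The exclusion rule described above can be sketched as follows. This is a minimal illustration rather than the screening script used in the study: the function name, the response label `"Unnatural"`, and the list-of-strings input format are all assumptions made for the example.

```python
def passes_attention_check(catch_responses, threshold=0.5):
    """Return True if the participant should be retained.

    catch_responses: one response label per animacy-violation filler
    (hypothetical format). A participant who marked `threshold` (50%)
    or fewer of these items as 'Unnatural' is excluded.
    """
    if not catch_responses:
        raise ValueError("no catch-trial responses supplied")
    flagged = sum(1 for r in catch_responses if r == "Unnatural")
    # Strictly greater than the threshold: exactly 50% still fails.
    return flagged / len(catch_responses) > threshold
```

Under this sketch, a participant who flags exactly half of the catch trials is excluded, matching the "50% or fewer" wording of the criterion.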
Gender congruence between a reflexive and two antecedents was manipulated in stimuli such as (6). The 2x2 design (following Sturt 2003 and Cunnings & Sturt 2014) crossed the gender features on the reflexive and the nonlocal antecedent, while keeping the gender of the local antecedent constant. This resulted in the following four conditions: a Local+/Nonlocal+ condition with two feature-matching antecedents (6-a), a Local+/Nonlocal- condition where only the local antecedent matched the reflexive's gender features (6-b), a Local-/Nonlocal+ condition where only the nonlocal antecedent matched (6-c), and a Local-/Nonlocal- condition with no feature-matching antecedent (6-d). We use "+" and "-" signs to indicate gender (mis)match between the local/nonlocal antecedent and the reflexive. Critical items followed this design while varying the syntactic structure (7). Participants were asked to identify the antecedent for the reflexive among four choices: the local antecedent, the nonlocal antecedent, 'Someone else not mentioned in the sentence', and 'This sentence is not natural', as in Figure 1. Participants were instructed to pick 'This sentence is not natural' if the sentence was not an acceptable English sentence. Since English 3rd-person reflexives require sentential antecedents, we did not expect many 'someone else' choices, but included this option to ensure maximal similarity with Experiment 2, which tested personal pronouns instead of reflexives.
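As an illustration of how the four conditions follow from crossing the two manipulated gender features, here is a small sketch. The function name, the `"m"`/`"f"` feature labels, and the dictionary layout are hypothetical conveniences, not part of the original materials.

```python
from itertools import product


def make_conditions(local_gender="m"):
    """Cross reflexive gender with nonlocal-antecedent gender (2x2).

    The local antecedent's gender is held constant, so each condition
    label is determined by whether the reflexive matches the local
    and the nonlocal antecedent respectively.
    """
    conditions = []
    for reflexive, nonlocal_antecedent in product(("m", "f"), repeat=2):
        local_mark = "+" if reflexive == local_gender else "-"
        nonlocal_mark = "+" if reflexive == nonlocal_antecedent else "-"
        conditions.append({
            "label": f"Local{local_mark}/Nonlocal{nonlocal_mark}",
            "reflexive": reflexive,
            "nonlocal_antecedent": nonlocal_antecedent,
            "local_antecedent": local_gender,
        })
    return conditions
```

With a masculine local antecedent, a feminine reflexive paired with a feminine nonlocal antecedent yields the Local-/Nonlocal+ condition, as in (6-c).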
24 sets of experimental items were constructed per structure type for a total of 120 items overall. Coargument and PNP stimuli were adapted from Cunnings & Sturt's (2014) items, substituting proper names for their gender-biased definite descriptions. PP, coordination, and comparative stimuli were constructed using the same design. Each participant saw 4 critical items per structure type. 3 practice trials and 26 filler items were also included. 20 filler items were structurally similar to the critical items, but used pronouns instead of reflexives. The remaining 6 fillers included animacy violations (e.g., Charles finished *its own homework and then he helped Mary) and served as catch trials; we excluded participants who selected 'This sentence is not natural' on 50% or fewer of these trials. 30 participants were excluded; the high exclusion rate may reflect an increased tolerance for ungrammatical English sentences induced by the presence of the ungrammatical Local-/Nonlocal- stimuli.

2.2. RESULTS AND DISCUSSION. Results are illustrated in Figure 2. For each structure type, we fit separate mixed-effects logistic regression models using the lme4 package in R (Bates et al. 2015). Excluding comparatives (see below), for each structure type two analyses were conducted. In the first analysis (primary), we evaluated participants' choice of 'Local' response vs. the other three response options (i.e., we binned 'Nonlocal', 'Someone else', and 'Unnatural' responses into a single 'Everything else' category). We included effects of Local Match ('LMatch', two levels: Match or Mismatch), Nonlocal Match ('NMatch', two levels: Match or Mismatch), and the LMatch by NMatch interaction. The variables LMatch and NMatch were sum coded (Mismatch: -1 and Match: 1). Each analysis included the maximal random effect structure supported by the data (Barr et al. 2013).

In an additional secondary analysis, we excluded trials in which participants selected the 'Local' response, and modeled participants' likelihood of choosing the 'Nonlocal' response relative to the other two remaining response options (i.e., we binned the responses 'Someone else' and 'Unnatural' into one category). This analysis contained only one predictor, Nonlocal Match (NMatch, two levels: Match or Mismatch). The maximal random effect structure supported by the data was included in the analyses.
For comparatives, since participants overwhelmingly preferred nonlocal antecedents, we modeled their 'Nonlocal' responses vs. everything else in the primary analysis (binning 'Local', 'Someone else', and 'Unnatural' responses into one category). In the secondary analysis for comparatives, we excluded the 'Nonlocal' responses, and modeled the 'Local' responses vs. the remaining two response options (binning 'Someone else' and 'Unnatural' into one category). The variable Local Match was the only predictor in the secondary analysis (LMatch, two levels: Match or Mismatch).
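The response binning behind the primary and secondary analyses can be sketched as below. This covers only the data preparation step, under assumed response labels and function names; it does not reproduce the mixed-effects models themselves, which were fit with lme4 in R.

```python
LOCAL, NONLOCAL = "Local", "Nonlocal"
SOMEONE_ELSE, UNNATURAL = "Someone else", "Unnatural"


def sum_code(is_match):
    """Sum coding for LMatch / NMatch: Match -> 1, Mismatch -> -1."""
    return 1 if is_match else -1


def primary_outcomes(responses, target=LOCAL):
    """Binary outcome: target response (1) vs. everything else (0).

    Use target=NONLOCAL for comparatives, where the nonlocal response
    is the one modeled in the primary analysis.
    """
    return [1 if r == target else 0 for r in responses]


def secondary_outcomes(responses, primary_target=LOCAL):
    """Drop trials with the primary target response, then code the other
    antecedent choice (1) vs. 'Someone else' / 'Unnatural' (0)."""
    secondary_target = NONLOCAL if primary_target == LOCAL else LOCAL
    return [
        1 if r == secondary_target else 0
        for r in responses
        if r != primary_target
    ]
```

For comparatives, both functions would be called with `NONLOCAL` as the target, so that the secondary outcome contrasts 'Local' responses with the two remaining options.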
Summaries of the statistical analyses for Experiment 1 are shown in Table 1. The primary analysis found significant effects of LMatch in coarguments (p<0.001), PNPs (p<0.001), PPs (p<0.001), and coordination (p<0.001), showing more local antecedent choices when there was a local feature-matching option. The primary analysis also revealed an effect of NMatch in PNPs (p<0.01) and coordination (p<0.001), showing fewer local antecedent choices in these structures when there was a nonlocal feature-matching antecedent. The secondary analysis found a significant effect of NMatch in PNPs (p<0.001), PPs (p<0.001), and coordination (p<0.001), but not coarguments (p=0.30), driven by the fact that in the first three constructions (but not in the coargument construction) there were more nonlocal choices when the nonlocal antecedent feature-matched the reflexive. For comparatives, the primary analysis found a significant effect of NMatch (p<0.001), but not LMatch (p=0.75), and the secondary analysis found no effect of LMatch (p=0.18). This was driven by the fact that in comparatives participants predominantly chose nonlocal antecedents when the nonlocal choice feature-matched the reflexive, and across all conditions in comparatives they rarely chose the local antecedent.
The significant effect of LMatch that our primary analysis found in the coargument, PNP, PP, and coordination conditions confirms that local antecedents are the preferred option for reflexives in these environments, while the significant effect of NMatch observed in PNP, coordination, and PP environments suggests that the presence of a nonlocal feature-matching antecedent reduced the likelihood of participants choosing the local antecedent and boosted their choices of a nonlocal antecedent. Importantly, though, we did not observe any NMatch effect for the coargument environment in either the primary or the secondary analysis.
Qualitative inspection of Figure 2 reveals that when the nonlocal antecedent had an effect, the rates at which participants chose nonlocal antecedents differed across structures. In particular, in PNPs and PPs, the presence of a local match appears to gate the availability of a nonlocal antecedent: in both Local+ conditions, participants overwhelmingly chose the local option, only choosing the nonlocal option in the Local-/Nonlocal+ condition when it was the only feature-matching antecedent available. However, in coordination, rates of local and nonlocal responses were roughly equal; participants even chose the nonlocal option in the Local+/Nonlocal+ condition in over 50% of trials.
Nonlocal antecedents for reflexives in comparatives were overwhelmingly preferred whenever they were available. The primary effect of NMatch we observed in the comparative condition confirms that nonlocal antecedents are available to reflexives embedded in comparative clauses. We found no effect of LMatch in either the primary or secondary analysis. Although this result could potentially be interpreted as indicating that local binding is syntactically unavailable in comparatives, we prefer to appeal to the role of comparative semantics in antecedent choice: it is impossible to be taller than yourself.
The Experiment 1 results thus reveal a graded taxonomy of noncomplementary environments, in which nonlocal antecedents for reflexives are either unavailable (coarguments), available but dispreferred (PNPs, PPs), equally available (coordination), or the preferred option (comparatives). These results broadly align with Cunnings & Sturt's (2014) findings: reflexives have a general preference for local antecedents, but the strength of this preference is regulated by construction-specific factors.
Surprisingly, participants did not select the 'Unnatural' option at ceiling rates in any construction in the Local-/Nonlocal- condition. Instead, rejection rates fluctuated between 50% and 75%. It is possible that such high rates of feature-mismatching responses are due to a task effect: despite the availability of the 'Someone else' and 'Unnatural' response options, the antecedent choice task may have put substantial pressure on participants to bind the reflexive. Another possibility is that some participants lacked strong gender associations for the names in our experimental stimuli and were comfortable resolving reflexives to names whose stereotypical gender did not match the reflexive.
3.1. METHODS. 88 native speakers participated in Experiment 2. Recruitment and screening were identical to Experiment 1. Data from 62 participants are reported below.
Experiment 2 was identical to Experiment 1 in the task and procedure, but the critical items were minimally modified to replace the reflexives with the corresponding pronoun (e.g., himself → him). That is, participants saw sentences such as (8). We used the same set of filler and practice items as in Experiment 1, but replaced the pronouns in the fillers with reflexives; practice trials and animacy violation filler items stayed the same.

[...] Table 2. The primary analysis found significant effects of NMatch in coarguments (p<0.001), PNPs (p<0.001), PPs (p<0.001), coordination (p<0.001), and comparatives (p<0.001), driven by the fact that participants made more nonlocal choices when there was a nonlocal feature-matching antecedent. The secondary analysis found significant effects of LMatch in PNPs (p<0.05), PPs (p<0.001), and coordination (p<0.01), but not coarguments (p=0.98) or comparatives (p=0.98).
The primary effect of NMatch observed across all 5 structures confirms that nonlocal antecedents are available to pronouns in every structure we tested, as predicted. The secondary effect of LMatch in the PNP, PP, and coordination conditions indicates that local antecedents are sometimes available to pronouns in these structures (but not coarguments or comparatives). As shown in Figure 3, however, local antecedents are still dispreferred, with rates of local choice hovering around 25% or less even in the Local+/Nonlocal- condition, where the local antecedent is the only feature-matching option. Unlike in Experiment 1, there was no construction in which rates of local and nonlocal choices were equal or in which the overall preference pattern was reversed.
The Experiment 2 results generally align with Experiment 1 in revealing graded complementarity across syntactic environments. Coargument and comparative pronouns exhibited a near-categorical preference for nonlocal antecedents, while local antecedents were sometimes chosen in the noncomplementary PNP, PP, and coordination environments. Even in conditions where local antecedents were available, however, they were chosen far less frequently than the nonlocal option. We interpret this result as indicating that pronouns always prefer nonlocal antecedents, but the strength of this preference varies by construction (see also Cunnings & Sturt 2018).
As in Experiment 1, rates of 'Unnatural' or 'Someone else' responses were not at ceiling in conditions where no feature-matching antecedent was available, but instead hovered between 75% and 90%. This finding reflects the pressure the forced-choice task puts on participants to find a sentential antecedent. Whereas in Experiment 1 participants' preferred unbound response was 'Unnatural', in Experiment 2 participants selected 'Someone else' at much higher rates, since pronouns, unlike reflexives, can always refer to sentence-external antecedents.

4. Discussion.
4.1. SUMMARY OF FINDINGS. The experiments reported in this paper provide evidence that comprehenders have broad, generally complementary locality-based preferences while resolving pronouns and reflexives. Participants generally preferred to resolve reflexives to local antecedents and pronouns to nonlocal antecedents even in syntactic environments where both options are presumed available. We also found that the nonlocal bias displayed by pronouns was less variable than the local bias displayed by reflexives, at least in the structures we tested. Although in Experiment 1 we were able to eliminate or even reverse (in the case of comparatives) the local bias for reflexives, nonlocal antecedents remained the preferred option for pronouns throughout Experiment 2.
We asked whether antecedent choice would be unrestricted outside coargument environments, or whether preferences would vary more finely across structures. Our results confirm that noncomplementary environments vary. Robust complementarity was observed only in the coargument condition. In PNPs and PPs, reflexives and pronouns generally preferred local and nonlocal antecedents respectively, but the dispreferred option was still available in both cases. Preferences diverged in the coordination condition: rates of local and nonlocal choices were roughly equal for reflexives, but pronouns maintained a strong nonlocal bias. Finally, in the comparative condition participants preferred nonlocal antecedents regardless of nominal form.

4.2. IMPLICATIONS FOR THEORY? To our knowledge, no existing theoretical proposal straightforwardly aligns with our experimental results. For instance, Reinhart & Reuland (1993) propose to restrict the binding conditions to coargument structures. This theory is compatible with our results insofar as only coarguments displayed strict complementarity, but it does not make clear predictions about noncomplementary environments; in particular, it does not predict that weaker locality-based preferences should persist in environments where the binding conditions do not apply. Alternatives to a coargument-based binding theory (Charnavel & Sportiche 2016; Charnavel & Zlogar 2016) preserve the configurational binding conditions proposed by Chomsky (1981; 1986) and account for noncomplementarity by including a provision for logophoric exemption: reflexives can be bound outside their local domain when they mark the perspective of an agent. If logophoricity is taken to be a marked option, our finding that reflexives generally prefer local antecedents could potentially support this sort of theory, but it is not clear why logophoricity should be differentially available across syntactic structures.
Rather than taking our results as evidence for or against any particular syntactic analysis, we prefer to view them as contributing to a growing line of experimental work showing that language users have weaker preferences that mimic the effects of the binding conditions in environments where strict complementarity is not observed, at least in English (Cunnings & Sturt 2014, 2018; Bryant 2022). The finding that weaker complementarity persists in ostensibly noncomplementary environments is consistent with the view of Levinson (2000), who argues that categorical instances of complementarity reflect the diachronic grammaticization of pragmatic competition between nominal forms. If forms are always in competition even outside contexts where competition is grammatically regulated, we would expect to see weaker, more graded versions of the coargument pattern in non-coargument contexts. That prediction was borne out, especially with PNPs and PPs.
Experiment 1 found roughly equal local and nonlocal choices for reflexives in the coordination condition, while Experiment 2 found a stronger, but not categorical, nonlocal preference for pronouns in the coordination condition. This contrast may align with the consensus view that condition B, but not A, applies to coordinate structures (Reinhart & Reuland 1993). On the other hand, if coordination is a condition B environment, it is surprising that local antecedents were available at all in Experiment 2. The judgments reported in the literature are not uniform (Jacobson 2007 claims that coordination is not a condition B environment), and it is possible that speakers disagree on the status of coordination.

4.3. WHY DO STRUCTURES VARY? The structures we tested in this paper appear to form a graded continuum with respect to complementarity, with coarguments demonstrating strongest complementarity, and locality-based preferences otherwise fluctuating in strength. Our results raise the question of what factors regulate the strength of these preferences across constructions. We should also ask whether the same factors regulate preferences across as well as within constructions.
The question of which factors regulate gradience within a particular construction has been addressed by Bryant (2022), who argues that preferences in PP anaphora are influenced by event structure: the preference for a reflexive is greater the more closely the event structure resembles a prototypically reflexive event involving self-directed action. Although we find this proposal plausible for PPs, it is not clear that an analysis based specifically on event structure is general enough to apply across the full range of structures we tested, which included non-episodic sentences that do not describe events (i.e., comparatives). However, it is worth exploring the possibility that a more general analysis of this form, in which preferences in noncomplementary environments are driven by closeness to a prototype, could succeed in predicting the strength of preferences across constructions.
One such account would adapt Reinhart & Reuland's (1993) "reflexive marking" idea and analyze variation in resolution preferences as the result of uncertainty about argument structure, with coargument structures as the prototype. If there is gradience in how readily comprehenders parse an anaphoric expression and a local antecedent as coarguments of a single predicate (possibly at a higher, non-compositional level of representation), then the rates at which a given construction is parsed as a "reflexive predicate" could potentially play a role in predicting the strength of preferences across environments. An advantage of this proposal is that it would derive our semantic explanation for why nonlocal antecedents are always preferred in comparatives: comparatives cannot express reflexive predicates because their meaning is incompatible with reflexivity.

4.4. OTHER INFLUENCES ON RESOLUTION.
There are many factors unrelated to syntactic structure that influence the resolution of anaphora (Arnold 2001; Kehler et al. 2008). Future work on resolution in noncomplementary environments should aim to disentangle preferences based on structure from preferences based on coherence, thematic role, perspective (Sloggett 2017), inherent reflexivity (Smits et al. 2007), and other semantic and pragmatic factors that we did not manipulate or control for.
In particular, a recurring finding is that pronoun resolution is influenced by the referent's degree of activation, or prominence, in the comprehender's discourse model (Arnold 1998, 2010). We found in Experiment 2 that participants strongly preferred to resolve pronouns to the matrix subject. We interpreted this result as evidence for a structural bias in favor of nonlocal antecedents, but it could also be taken as a prominence effect (we thank Shannon Bryant for raising this point and subsequent discussion). Since our design did not vary the prominence of the matrix subject across constructions, this interpretation of the Experiment 2 results would not necessarily change our conclusions about how constructions vary, but it is possible that we would observe overall higher rates of local resolution if prominence were controlled for. Future work could address this issue by including context sentences prior to the critical items that boost the prominence of either the matrix or the embedded subject.

5. Conclusion.
This paper presented experimental evidence from two antecedent choice tasks testing pronoun and reflexive resolution preferences in coargument, PNP, PP, coordination, and comparative structures, finding evidence for generally complementary locality-based preferences, as well as significant variation across environments. These results suggest that complementarity between pronouns and reflexives may be graded across syntactic structures.

Figure 2. Proportion of antecedent choices in Experiment 1 by structure type.

Figure 3. Proportion of antecedent choices in Experiment 2 by structure type.
(6) Experimental conditions
    a. L+/N+: {Timothy} knew that Mark had lost {himself} near the back of the store.
    b. L+/N-: {Miranda} knew that Mark had lost {himself} near the back of the store.
    c. L-/N+: {Miranda} knew that Mark had lost {herself} near the back of the store.
    d. L-/N-: {Timothy} knew that Mark had lost {herself} near the back of the store.
(7) [...]
    b. PNPs: {Timothy / Miranda} knew that Mark kept a photo of {himself / herself} near the back of the store.
    c. PPs: {Timothy / Miranda} claimed that Mark had found a gun near {himself / herself} in a paper bag.
    d. Coordination: {Timothy / Miranda} claimed that Mark had impressed both Mark and {himself / herself} during the performance.
    e. Comparatives: {Timothy / Miranda} claimed that Mark was taller than {himself / herself} by six inches.

Table 1. Parameter estimates, standard errors, z values and p values from generalized linear mixed-effects models of antecedent choice in Ex. 1, with rows of interest highlighted in gray. Predictors: Local Match ('LMatch', two levels: Match or Mismatch), Nonlocal Match ('NMatch', two levels: Match or Mismatch), and the LMatch by NMatch interaction; LMatch and NMatch were sum coded (Mismatch: -1 and Match: 1).

(8) a. Coarguments: {Timothy / Miranda} knew that Mark had lost {him / her} near the back of the store.
    b. PNPs: {Timothy / Miranda} knew that Mark kept a photo of {him / her} near the back of the store.
    c. PPs: {Timothy / Miranda} claimed that Mark had found a gun near {him / her} in a paper bag.
    d. Coordination: {Timothy / Miranda} claimed that Mark had impressed both Mark and {him / her} during the performance.
    e. Comparatives: {Timothy / Miranda} claimed that Mark was taller than {him / her} by six inches.

3.2. RESULTS AND DISCUSSION. Results are shown in Figure 3. Analysis was conducted using mixed-effects logistic regression models using lme4, with separate analyses for each structure type. For Experiment 2, we also conducted a primary and a secondary analysis. The pri- [...]

Table 2. Parameter estimates, standard errors, z values and p values from generalized linear mixed-effects models of antecedent choice in Ex. 2, with rows of interest highlighted in gray.