Inactive gap formation: An ERP study on the processing of extraction from adjunct clauses

Filler-gap (movement, extraction, displacement) dependencies are processed actively, i.e., comprehenders anticipatorily commit to an interpretation of the sentence before encountering bottom-up evidence. This suggests that comprehenders make structural commitments to how a sentence will unfold shortly after encountering a filler NP. However, the grammaticality of some filler-gap dependencies may depend on semantic and pragmatic features of the sentence that are not typically considered in studies on filler-gap dependencies. One particular case is extraction from adjunct clauses, in which the filler NP may only grammatically be understood as the object of a non-finite adjunct clause if the main verb is an achievement predicate (e.g., What coffee did you arrive [ drinking ]? Truswell 2011). We present evidence from an EEG study demonstrating that comprehenders do not actively construct filler-gap dependencies in constructions such as these. Instead, they “inactively” build the dependency, only after integrating semantic information about the adjunct clause into the sentence.

the surprisal of encountering this word. This increased processing cost is called the plausibility mismatch effect.
(2) This is the city that the author wrote enthusiastically about .
Although active dependency formation processes are well-demonstrated, the precise formulation of these processes has not been carefully described. For instance, they can be cast as a prospective search for a resolution site that is launched shortly after encountering the filler phrase, a re-ranking of preferred parses of the sentence (Frazier 1987; Frazier & Clifton Jr. 1989), or an anticipatory predictive process of upcoming syntactic/semantic material (Omaki et al. 2015; Aoshima et al. 2004; Chacón 2019). On this latter interpretation, the reason the plausibility mismatch effect is observed in sentences like (2) is that, after encountering the city, comprehenders build a sufficiently detailed analysis of the sentence in which the city is interpreted as the theme of the verb and binds a projected gap in the direct object position, (3). After encountering the verb wrote and incorporating it into this previously-constructed analysis, comprehenders detect the implausible interpretation. Importantly, the plausibility mismatch effect demonstrates that comprehenders make a (syntactic and semantic) structural commitment before encountering the resolution site, i.e., the plausibility of the interpretation of the dependency in the context of the sentence does not appear to exert an effect early in the processing of the dependency (see also Omaki et al. 2015). Indeed, theories of active dependency formation privilege structural properties over, e.g., probability, plausibility, or predictability (Pritchett 1991; de Vincenzi 1991; Aoshima et al. 2004).
Filler-gap dependency processes are active, but they are also selective. Although filler-gap dependencies may cross indefinitely many words, (4-a), there is a set of grammatical configurations that they may not cross without a significant reduction in acceptability, referred to as syntactic islands (Ross 1967), (4-b).
These phenomena are typically attributed to well-formedness conditions on the representation of the sentence (Ross 1967; Chomsky 1977, 1981; Huang 1982; Uriagereka 1999; Rizzi 1990, 2013; but see Deane 1991; Kluender & Kutas 1993; Kluender 1998; Hofmeister & Sag 2010). Importantly, evidence of active dependency formation for filler-gap dependencies is not observed in island configurations. That is, comprehenders only actively consider filler-gap dependencies that are also perceived to be acceptable. This finding has been interpreted as reflecting a tight coupling between grammatical constraints and processing strategies (Phillips 2006, 2013; Yoshida et al. 2014; Omaki et al. 2015). If filler-gap dependency processing depends on fine-grained structural prediction mechanisms, as we envision, then such island-sensitivity can be characterized as a quick and accurate use of structural information to distinguish possible vs. impossible predicted structures.
(4) a. This is the apple that Jodie knows . . . [ that Ernie ate ].
b. *This is the apple that Jodie wonders [ whether Ernie ate ].
Although island phenomena are typically cast as syntactic constraints, (compositional and conceptual) semantic and pragmatic factors are observed to affect the patterns of acceptability (Goldsmith 1985; Lakoff 1986; Kehler 2002; Truswell 2007a, 2011; Szabolcsi 2006; Ambridge & Goldberg 2008). Here, we specifically focus on the status of adjunct clauses. Traditionally, tensed adjunct clauses are thought to be syntactic islands (Huang 1982; Chomsky 1986; Cinque 1990; Uriagereka 1999), due to facts like those in (5). However, Robert Truswell has identified a number of semantic factors that modulate the acceptability of extraction from adjunct clauses (Truswell 2007a, b, 2011). For instance, extraction from a tensed adjunct clause is perceived to be worse than extraction from an untensed one, (6). Among untensed adjunct clauses, extraction from a predicate denoting an activity (in the sense of Vendler 1967; eating the apples) is more acceptable when the main predicate is an achievement (e.g., arrive), rather than another activity predicate (e.g., work), as in (7).
To account for such contrasts, Truswell proposed that extraction from an adjunct clause is grammatical if it is possible to construe the main and adjunct predicates as codescribing the same event. Formally, this may occur if the event variable introduced by the adjunct clause can be identified with the event variable introduced in the main clause (Kratzer 1996). For instance, extraction from tensed adjuncts, (6-b), is perceived to be worse than extraction from untensed adjuncts, because the event variable introduced by the adjunct clause is bound by the existential quantifier introduced with tense, (8), thus preventing event identification. More crucially for our study, the contrast in (7) is explained by appealing to the representation of achievement predicates.
Following Higginbotham (2009), Truswell proposes that achievement predicates [2] contain an unbound event variable, corresponding to the events or state of affairs preceding the event of culmination lexicalized by the predicate, (9-a). Thus, the event variable introduced by a non-finite adjunct clause containing an activity predicate can be identified with this extra main clause event variable, yielding a macro-event, represented by E in (9-b). This mode of composition is unavailable when the main verb is an activity predicate, (9-c).
[2] Truswell also distinguishes between true achievements and points. True achievements lexicalize a culmination point, and imply a previous process (represented as the unbound event variable), whereas points lack this previous process. This can be seen in (i), adapted from Truswell (2007b, 2011), in which the true achievement reach the summit denotes the pre-culmination process when put in the progressive aspect, whereas the point notice does not. The analysis sketched here only applies to true achievements.
(i) a. I'm reaching the summit as we speak
b. I'm noticing the problem as we speak
Although the form of explanation for the contrasts above focuses on the nature of the compositional semantics of the main and adjunct predicate, Truswell (2011) is careful to observe that aspects of pragmatics and world knowledge are relevant for determining the island status of an adjunct clause as well. For instance, (10-b) is less acceptable than (10-a), because the event of dripping mud on a carpet is unlikely to precede an event of entering a home, given the world knowledge that carpets are usually indoors and not outdoors. Thus, aspects of the compositional semantics (tense, lexical aspect) appear to interact with world knowledge in determining whether a given filler-gap dependency is perceived to be acceptable. If these generalizations are robust, then this suggests that the island status of adjunct clauses is affected by factors that are not typically considered in theories of filler-gap dependency processing.
(10) a. This is the carpet that John exited the house [ dripping mud on ]
b. This is the carpet that John entered the house [ dripping mud on ]
In our work reviewed in the next section (Kohrt et al. 2018, 2019), we have sought to characterize the psycholinguistic processes involved in understanding filler-gap dependencies crossing into adjunct clauses. Specifically, we were interested in seeing whether comprehenders were capable of leveraging aspects of the semantics of the main predicate to selectively project a gap in an adjunct clause. We wanted to know whether, before hearing the verb drinking in a sentence like (11), comprehenders actively considered an interpretation in which the coffee would then be interpreted as its object.
(11) John prepared the coffee that his best friend arrived at the office [ Adjunct CP drinking ]
For sentences of this type, we failed to find the typical processing profile associated with filler-gap dependencies. Instead, we found a novel pattern of results, in which processing time increased for plausible filler-gap dependencies in these configurations, i.e., there was an additional processing cost associated with a plausible interpretation, rather than an implausible one. We argued that this reflected bottom-up, "inactive" dependency completion, which we attributed to the unusual status of extraction from adjunct clauses generally. In this paper, we strengthen the empirical basis for this novel pattern of results by replicating this effect using electroencephalography (EEG), and relate it to the broader question of what kind of parsing strategies underpin filler-gap dependency processing.

2. Previous findings.
In previous work, we examined whether the reported contrast in (7) is reflected in real-time sentence processing. In contrast to much of the work on the processing of filler-gap dependencies, which characterizes the processes in syntactic terms, we sought to determine whether comprehenders were capable of using semantic features of the sentence to selectively project a gap in an adjunct clause. One hypothesis that we considered was that comprehenders pursued a permissive strategy, and considered every non-finite adjunct clause as a possible host for a gap. This predicted evidence of active dependency formation in all non-finite adjuncts, regardless of the identity of the main verb. Alternatively, we hypothesized that comprehenders could selectively apply active gap formation in non-finite adjunct clauses depending on the semantic aspects of the previous verb phrase. If so, we predicted evidence of active gap formation in non-finite adjunct clauses, but only if the previous verb was an achievement. This analysis would imply that comprehenders could quickly incorporate some semantic information into the analysis, and use this information to revise a predicted structure.
To test this, we conducted two self-paced reading experiments (reported in Kohrt et al. 2018 and Kohrt et al. 2019), using a plausibility mismatch paradigm, exemplified in (12). We expected to find increased reading times associated with the implausible filler-gap dependencies (the report . . . drinking) compared to the +Plausible controls. We did not find evidence of a plausibility mismatch effect in the comparison of the levels of ±Plausibility within the levels of ±Composable. Instead, we found that processing time increased when the filler was a plausible argument, i.e., a "reverse" plausibility mismatch effect.
(12) a. +PLAUSIBLE, +COMPOSABLE: John prepared the coffee that his best friend arrived at the office [ Adjunct CP drinking yesterday afternoon ]
b. +PLAUSIBLE, −COMPOSABLE: John prepared the coffee that his best friend worked at the office [ Adjunct CP drinking yesterday afternoon ]
c. −PLAUSIBLE, +COMPOSABLE: John prepared the report that his best friend arrived at the office [ Adjunct CP drinking yesterday afternoon ]
d. −PLAUSIBLE, −COMPOSABLE: John prepared the report that his best friend worked at the office [ Adjunct CP drinking yesterday afternoon ]
As a post hoc interpretation of this result, we suggested that comprehenders do not initially attempt to build a filler-gap dependency into an adjunct clause. Instead, the semantic information of the filler phrase is maintained while the adjunct clause is integrated into the syntactic and semantic analysis of the sentence, (13-a). In the case of processing an achievement predicate, integrating the adjunct clause into the global analysis of the sentence then cues the construction of a macro-event, which may also require accessing information about world knowledge from memory, (13-b). If the filler NP is a suitable theme of the event described by the adjunct clause and of the resulting macro-event, then comprehenders construct a gap in a "bottom-up" manner, given that the main predicate-adjunct predicate complex codescribes a single event, (13-c). This extra step in accommodating an otherwise unlicensed gap triggers increased processing time.
This contrasts with our model of typical filler-gap dependency processing in several ways. First, instead of anticipatorily building a syntactic and semantic analysis to accommodate the resolution of the filler-gap dependency, comprehenders construct a resolution site for the dependency only after processing some of the adjunct clause. Additionally, instead of using the presence of the filler-gap dependency to project a partial semantic structure that may result in an implausibility, we propose that comprehenders incorporate plausibility information first, while relating the (sub)events described by the main clause and the adjunct clause, and then selectively resolve the dependency. In other words, whereas typical filler-gap dependencies involve early commitment to specific structures, which can then be quickly confirmed or disconfirmed, processing filler-gap dependencies resolving into an adjunct clause is a slower process that is gated initially by building a semantic (and conceptual) representation. However, the fine-grained interpretation that we gave these results is underdetermined by the results themselves, e.g., there is no distinct signal in the data indexing initial surprisal that the verb does not match the initially predicted structure vs. the hypothetical later bottom-up gap completion process we described above. For this reason, we opted to use EEG, due to its higher temporal resolution and its increased dimensionality, which allows for distinguishing different phases of processing. Previous EEG studies on the processing of filler-gap dependencies, described in the next section, show electrophysiological activity indexing the completion of the filler-gap dependency within several hundred milliseconds of encountering the verb.
However, we show that the event-related potential (ERP) record for sentences like (12) does not diverge until significantly later than this, consistent with the inactive, meaning-driven, bottom-up parsing strategy that we sketched above.

3. Experiment.
3.1. RATIONALE. The goal of the experiment is to determine whether comprehenders actively build filler-gap dependencies into adjunct clauses. Previous self-paced reading results failed to demonstrate evidence of active dependency formation, but instead yielded an increase in processing time for plausible sentences, which we interpreted as evidence of a bottom-up mode of composition. However, because self-paced reading data depend on a motor response that may not be closely time-locked with comprehension processes, they do not give a high-resolution snapshot of the distinct phases of processing. Thus, we chose to conduct an EEG study, because this technique has higher temporal resolution. In addition, EEG provides different indices of processing ("ERP components"), which provide a more detailed picture of the underlying mechanisms.
Much of the discussion above has focused on the initial detection of the plausibility of a filler-gap dependency as detected in reading times. However, Garnsey et al. (1989) demonstrated a similar sensitivity to (im)plausibility of a filler-gap dependency in EEG. They found that, at the verb called, the amplitude of the N400, a negative-going deflection that peaks around 400ms after onset of stimulus, was larger in sentences like (14-a) than in (14-b), which they attributed to the difference in plausibility of the filler-gap dependency at this point in the sentence. The N400 component has traditionally been understood to index difficulty in incorporating a word or phrase into the semantic representation of the sentence (Kutas & Federmeier 2011). However, this interpretation remains controversial, e.g., amplitude of the N400 component may instead index difficulty in accessing a lexical item, with sentence-level predictions facilitating or inhibiting these processes (Lau et al. 2008). Regardless of the precise interpretation of this component, this result demonstrates that in typical active dependency formation situations, comprehenders have incorporated some information about the relation between the filler phrase and the verb by 200-400ms post-onset of the verb.
Table 1. Predicted results on N400 and P600 component.
Strategy | N400 | P600 | Later
Active-gap formation | Effect of −Plausible | - | -
Inactive-gap formation | - | - | Effect of +Plausible, +Composable
(14) a. The businessman knew which article the secretary called at home
b. The businessman knew which customer the secretary called at home
Another component that has been identified as reflecting filler-gap dependency processes is the P600, a positive deflection that typically begins around 600ms after onset of the stimulus, which is observed in many contexts that trigger processing difficulty (King & Kutas 1995; Kaan et al. 2000; Phillips et al. 2005; Gouvea et al. 2009). For instance, the P600 is observed upon completion of a filler-gap dependency, and the latency of the component is affected by the length of the dependency (Phillips et al. 2005). Thus, differences in the latency and amplitude of the P600 component may reflect differences in whether or when a filler phrase is integrated into the gap site.
If comprehenders are capable of actively completing a filler-gap dependency into an adjunct clause, then we predict that the amplitude of the N400 component should increase for implausible filler-gap dependencies over plausible ones, replicating the early sensitivity to plausibility reported by Garnsey et al. (1989). Similarly, if comprehenders only selectively build filler-gap dependencies into adjuncts if the previous verb was an achievement predicate, then we predict an effect of ±Composability on the P600 component. By contrast, if comprehenders complete filler-gap dependencies in the bottom-up fashion sketched in (13), then we predict that the ERP signal of our sentences should be similar until significantly later, e.g., after 800-1000ms. That is, if the psycholinguistic processes involved in resolving a filler-gap dependency into an adjunct clause are dependent on integrating the meaning of the adjunct clause into the analysis first, then we predict that the effect should follow the traditional language-related ERP components associated with processing the adjunct clause verb. These predictions are sketched in Table 1.
3.2. MATERIALS. We prepared 144 sets of items, with 4 items per set, using the same design as presented in (12). Each item consisted of a preamble (John prepared), followed by a filler NP that was either a plausible theme of the verb located in the adjunct clause (the coffee, +Plausible) or not (the report, −Plausible). Following this, there was a relative clause pronoun (that), followed by a 3-word subject NP (his best friend), then either an achievement verb (arrived, +Composable) or an activity or accomplishment verb (worked, −Composable). Afterwards, there was a three-word PP (at the office). We tried to keep these PPs the same across conditions within each item set, but this was not possible in all items due to the argument structures of the previous verb. However, the final word in this PP was the same across all conditions. Afterwards, the critical region was the non-finite adjunct clause verb (drinking), followed by a 2- or 3-word spillover region (late this afternoon), which was identical across conditions within each item set.
The verbs that we selected for the main verbs and for the adjunct verbs were controlled, such that the +Composable verbs were always paired with the same −Composable verbs across item sets. The pairs of main verbs were arrive/work, reach/wait, leave/chat, enter/play, exit/ride, and go/watch. This meant that there were 24 sets of items associated with each pair of verbs. We also sought to make the identity of the adjunct verb less predictable by counterbalancing which non-finite adjunct verb was paired with which main verbs. There were six verbs we used for this: drinking, reading, singing, drawing, eating, and knitting. These verbs were selected because they are equally interpretable as a transitive verb (i.e., with the filler NP as its direct object) or as intransitive. This design choice was made to avoid forcing participants to interpret the filler-gap dependency as resolving with this verb solely due to its transitivity requirements. Counterbalancing these features across all items resulted in 4 sets of items per combination of adjunct verb and main verb pair.
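The counterbalancing arithmetic can be checked with a short sketch. The verb lists and the total of 144 item sets come from the text; the round-robin assignment in `assign_verbs` is only an assumption made for illustration, not a description of how the items were actually constructed.

```python
from collections import Counter
from itertools import product

# Main-verb pairs (+Composable, -Composable) and adjunct verbs, as listed in the text.
MAIN_VERB_PAIRS = [("arrive", "work"), ("reach", "wait"), ("leave", "chat"),
                   ("enter", "play"), ("exit", "ride"), ("go", "watch")]
ADJUNCT_VERBS = ["drinking", "reading", "singing", "drawing", "eating", "knitting"]
N_ITEM_SETS = 144

def assign_verbs(n_sets=N_ITEM_SETS):
    """Hypothetical helper: cycle through every (main-verb pair, adjunct verb)
    cell so that all cells receive the same number of item sets."""
    cells = list(product(MAIN_VERB_PAIRS, ADJUNCT_VERBS))  # 6 x 6 = 36 cells
    return [cells[i % len(cells)] for i in range(n_sets)]

assignment = assign_verbs()
per_cell = Counter(assignment)                       # item sets per pair x adjunct verb
per_pair = Counter(pair for pair, _ in assignment)   # item sets per main-verb pair

print(len(per_cell), set(per_cell.values()), set(per_pair.values()))
```

Under these assumptions, a fully crossed design with 144 item sets yields 24 sets per main-verb pair and 4 sets per combination of main-verb pair and adjunct verb.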
We also included 256 complexity-matched fillers, with 50% containing minor grammatical violations, semantic anomalies, or both. The items were distributed in a 2 × 2 Latin Square design, such that each participant saw one item from each set of items, and all the filler items.
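A Latin Square distribution of this kind can be sketched as follows. The condition labels and the number of item sets are taken from the text; the rotation rule `(item + k) % 4` and the function name `latin_square_lists` are illustrative assumptions, since the text does not specify how lists were built.

```python
from collections import Counter

# The four cells of the 2 x 2 design, labelled as in example (12).
CONDITIONS = [
    ("+Plausible", "+Composable"),
    ("+Plausible", "-Composable"),
    ("-Plausible", "+Composable"),
    ("-Plausible", "-Composable"),
]

def latin_square_lists(n_sets=144, conditions=CONDITIONS):
    """Build one presentation list per condition. List k shows item set i in
    condition (i + k) mod 4, so every item set rotates through all conditions
    across lists and appears exactly once within each list."""
    n_cond = len(conditions)
    return [
        [(item, conditions[(item + k) % n_cond]) for item in range(n_sets)]
        for k in range(n_cond)
    ]

lists = latin_square_lists()

# Each list contains 36 item sets per condition.
counts_first_list = Counter(cond for _, cond in lists[0])
print(counts_first_list)
```

With this rotation, each participant list contains every item set exactly once and 36 items per condition, and the 256 fillers would simply be appended to every list.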
3.3. METHODS. Participants (N = 24) from the University of Minnesota community sat in a dimly lit, sound-proof room in front of a computer at the Center for Applied and Translational Sensory Science (CATSS). We affixed a 32-electrode Ag/AgCl BioSemi cap to the participant's scalp, and applied conductive gel to keep each electrode's resistance below 40 kΩ. Afterwards, participants were instructed to silently read sentences displayed on the screen. The sentences were displayed in a rapid serial visual presentation (RSVP) design, in which words were automatically displayed one at a time for a fixed duration of 500ms, with an 83ms inter-word interval. The last word of each sentence lasted for 900ms. Electrical potentials were recorded in real-time, referenced to two electrodes affixed to the mastoids. Twenty percent of trials were followed by a comprehension question, and participants responded using the keyboard.
Participants were given feedback on the screen if they gave an incorrect answer. After the other 80% of trials, participants were instructed to press a key to advance to the next sentence. Each sentence was preceded by a fixation cross which also appeared for 500ms. Sentences were displayed in a random order, in 4 distinct blocks. From start to finish, the task took approximately 90 minutes. Participants were allowed to take breaks between each block, and they communicated with the researcher through a baby monitor. Participants were compensated 20USD per hour for their time.
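The RSVP timing determines where each word falls relative to the critical verb. The sketch below uses the 500ms word duration and 83ms inter-word interval given above, and the epoch window of −100 to 1600ms described in the results; the helper names and the use of example (12-a) as a test sentence are illustrative assumptions.

```python
WORD_MS = 500   # display duration of each word, from the text
GAP_MS = 83     # blank interval between words, from the text
SOA_MS = WORD_MS + GAP_MS  # onset-to-onset interval: 583 ms

def onsets_relative_to(words, critical_word, soa=SOA_MS):
    """Onset of every word in ms, with the critical word's onset set to 0."""
    idx = words.index(critical_word)
    return [(w, (i - idx) * soa) for i, w in enumerate(words)]

def words_in_window(rel_onsets, tmin=-100, tmax=1600):
    """Words whose onset falls inside the epoch window (in ms)."""
    return [(w, t) for w, t in rel_onsets if tmin <= t <= tmax]

sentence = ("John prepared the coffee that his best friend arrived "
            "at the office drinking yesterday afternoon").split()
rel = onsets_relative_to(sentence, "drinking")

print(words_in_window(rel))
# [('drinking', 0), ('yesterday', 583), ('afternoon', 1166)]
```

On this timing, the second spillover word begins 1166ms after the onset of the critical verb, and the head noun of the filler phrase in this example was presented more than five seconds before it.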
3.4. RESULTS. Before analysis, we applied a 1Hz high-pass FIR filter to the raw EEG signals. Then, we extracted epochs at each verb in the target sentences. The epochs ranged from 100ms before onset of the adjunct verb (drinking) to 1600ms afterwards, which contains the next two words in the spillover region. The epochs were baselined to the average potential in the 100ms before onset of the adjunct verb. This word was identical within each item set across conditions. Then, we automatically rejected epochs that exceeded a peak-to-peak threshold of 150 µV. Three subjects had a large artefact rejection rate, and thus were excluded from analysis. Two additional subjects were removed due to poor recording quality. After epoch rejection, we then applied ICA correction to remove eye-blinks and other periodic artefacts.
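The baseline-correction and peak-to-peak rejection steps can be illustrated with a minimal sketch in plain Python. The function names, the toy data, and the representation of an epoch as a dictionary from channel name to samples are assumptions for illustration; the actual analysis pipeline and software are not specified in the text.

```python
THRESHOLD = 150.0  # peak-to-peak limit, in the same amplitude unit as the samples

def baseline_correct(samples, times_ms, t_start=-100, t_end=0):
    """Subtract the mean of the pre-stimulus interval [t_start, t_end) from every sample."""
    base = [v for v, t in zip(samples, times_ms) if t_start <= t < t_end]
    mean = sum(base) / len(base)
    return [v - mean for v in samples]

def peak_to_peak(samples):
    """Difference between the largest and smallest sample in one channel."""
    return max(samples) - min(samples)

def keep_epoch(epoch, threshold=THRESHOLD):
    """Keep an epoch only if no channel exceeds the peak-to-peak threshold.
    `epoch` maps channel name -> list of samples (hypothetical layout)."""
    return all(peak_to_peak(ch) <= threshold for ch in epoch.values())

# Toy example: one sample every 50 ms, critical-word onset at 0 ms.
times = [-100, -50, 0, 50, 100]
corrected = baseline_correct([2.0, 4.0, 5.0, 9.0, 1.0], times)
print(corrected)  # [-1.0, 1.0, 2.0, 6.0, -2.0]

clean = {"Cz": corrected, "Pz": [0.0, 3.0, -4.0, 10.0, 2.0]}
noisy = {"Cz": corrected, "Pz": [0.0, 3.0, -90.0, 80.0, 2.0]}
print(keep_epoch(clean), keep_epoch(noisy))  # True False
```

Rejecting on the whole epoch whenever any single channel exceeds the threshold is one common convention; other pipelines interpolate or drop only the offending channel.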
In the N400 time window, the first mixed effects model indicated that voltage was more negative over the entire scalp for +Plausible filler-gap dependencies (β = −0.80, SE= 0.39, t = −2.0, p = 0.04), and +Plausible, −Composable sentences had more negative voltage in posterior regions (β = −1.59, SE= 0.79, t = −2.0, p = 0.04). However, no post-hoc pairwise comparison conducted on this model within any of the factors was significant (p > 0.05). Similarly, the mixed effects model fit to the midline electrodes had no significant intercept in this time window (p > 0.05). Thus, although there were significant intercepts in the first model, it is difficult to interpret this as a traditional N400 effect. This is because voltage was more negative for plausible filler-gap dependencies, whereas typically N400 amplitude is more negative for words that are implausible, surprising, or unpredictable given the context. Additionally, the N400 effect is typically observed over midline electrodes, but the significant interaction intercept in the model suggested that the effect was driven by posterior electrodes, which is an atypical scalp distribution for an N400 effect. The lack of traditional N400 effect is underscored by the lack of significant effect in the midline electrode model, where the effect would be expected.
In the P600 time window, there were no significant intercepts in the first model (p > 0.05). However, the midline model showed that voltage was more negative for −Composable conditions (β = −1.33, SE= 0.47, t = −2.9, p < 0.01). Pairwise comparisons revealed that voltage was more positive for +Composable conditions over −Composable conditions within both levels of ±Plausible (p < 0.05), suggesting that this is an effect of the combination of the two verb phrases, and thus does not necessarily reflect any interaction between the plausibility of the filler-gap dependency and the identity of the previous verb phrases. Thus, this apparent P600 effect may reflect processes relevant to syntactically or semantically combining the adjunct clause to the preceding clause, not necessarily the integration of the filler NP into the analysis.
Figure 1. Average voltage by region across conditions. Error ribbons correspond to two standard errors from the mean. Grey areas correspond to N400 time-window (300-500ms) and P600 time-window (600-800ms). Stars indicate that pairwise comparisons conducted on linear mixed effects model revealed a significant effect in this time window. Voltages were low-pass filtered at 20Hz for graphing purposes only.
Afterwards, we conducted a cluster-based permutation test. This is a non-parametric test that seeks to find contiguous points in time and space (electrodes) that are exchangeable, then conducts the significance testing within this cluster (Maris & Oostenveld 2007). This technique is useful when there is no specific spatial or temporal region that we expect to find an effect in a priori. We conducted cluster-based permutation tests in 4 distinct time windows from the onset of the verb, each 500ms long, excluding the first 10ms from onset of the critical verb and the last 10ms of the final time window. The permutation test had a minimum time of 10ms for the clusters, an α-level of 0.05, and used 5000 samples. We found a cluster that ranged from 1236-1309ms over 21 electrodes. Spatially, this cluster was distributed over most of the scalp, but had the strongest effect in the posterior left quadrant, around electrode CP5. Within this cluster, there was a marginally significant interaction between ±Plausibility and ±Composability (p = 0.09). Although this cluster was only about 70ms long, visual inspection revealed that many electrodes appeared to demonstrate a sustained positivity for +Plausible, +Composable conditions throughout this time window, which may peak in the left posterior region. This is graphed in Figure 2. Thus, it appears that the effect of processing a plausible filler-gap dependency into an adjunct clause is sensitive to the choice of previous verb, and that these processes are detectable more than two words after the onset of the verb, consistent with an analysis in which comprehenders construct the dependency in a "bottom-up" fashion, as we argued previously.
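The logic of a cluster-based permutation test can be sketched in a simplified, time-only form. This is not the analysis reported above, which clustered over both time and electrodes and tested an interaction: here the input is assumed to be one per-subject difference wave on a single channel, the permutation scheme is sign-flipping, and the cluster-forming threshold `t_crit = 2.0` and all function names are illustrative assumptions.

```python
import math
import random

def t_values(diffs):
    """One-sample t statistic across subjects at each time point."""
    n = len(diffs)
    out = []
    for col in zip(*diffs):
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def max_cluster_mass(tvals, t_crit):
    """Largest |summed t| over runs of adjacent, same-sign, supra-threshold points."""
    best = mass = 0.0
    prev_sign = 0
    for t in tvals:
        sign = (t > t_crit) - (t < -t_crit)
        if sign != 0 and sign == prev_sign:
            mass += t          # extend the current cluster
        elif sign != 0:
            mass = t           # start a new cluster
        else:
            mass = 0.0         # below threshold: no cluster
        prev_sign = sign
        best = max(best, abs(mass))
    return best

def cluster_permutation_p(diffs, t_crit=2.0, n_perm=5000, seed=0):
    """Compare the observed maximum cluster mass with its distribution under
    random sign flips of each subject's whole difference wave."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_values(diffs), t_crit)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for subj in diffs:
            s = rng.choice((-1.0, 1.0))
            flipped.append([s * x for x in subj])
        if max_cluster_mass(t_values(flipped), t_crit) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated data: 19 subjects, 40 time points, a positive effect at points 25-32.
rng = random.Random(1)
data = [[(1.0 + rng.gauss(0, 0.5)) if 25 <= t <= 32 else rng.gauss(0, 1.0)
         for t in range(40)] for _ in range(19)]
mass, p = cluster_permutation_p(data, n_perm=1000)
print(round(mass, 1), p)
```

Because only the maximum cluster statistic is compared against the permutation distribution, the resulting p-value concerns the presence of a difference somewhere in the window and not the exact onset or extent of the cluster.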

4. General discussion and conclusion. To understand a sentence like (11), we argued that comprehenders do not deploy the typical active filler-gap dependency processing strategies. Instead, comprehenders use an "inactive" strategy, in which the syntactic and semantic characteristics of the adjunct clause are first computed, and then the filler-gap dependency is resolved on the basis of these features. We argued for this processing strategy previously on the basis of behavioral data (Kohrt et al. 2018, 2019); however, the low resolution of the self-paced reading technique did not provide a clear enough snapshot to support this analysis. In this paper, we used EEG, which provides a clearer snapshot into these processes, due to its higher temporal resolution and the higher dimensionality of the EEG data. Consistent with our proposal, we failed to find any evidence of active filler-gap dependency formation in the typical language-related ERP components, the N400 and P600. Instead, the effect of extraction from an adjunct clause was observed significantly later, more than 1200ms after the onset of the critical verb. Despite this, we remain cautious in our interpretation of the results, because there are some methodological and theoretical concerns related to the design of this and our previous studies. First, the experiment presented in this paper has a low sample size. Although we conducted the study with 24 participants, we were only able to report on 19, due to data hygiene issues. This means that higher-powered studies will be required in the future to further strengthen our claims.
Secondly, in all four of our target conditions, the filler-gap dependency had no grammatical resolution. Thus, comprehenders may have been forced to interpret the filler phrase as an argument of the adjunct verb regardless. Because of this design issue, we have no condition that independently tests for the effect of integrating the adjunct clause with the main predicate, separate from interpreting the filler-gap dependency. That is, we have no measure of the brain activity of associating a main clause achievement predicate with an activity adjunct clause vs. two activity predicates. Future studies will need to incorporate a condition with no adjunct clause to more directly isolate this effect.
Similarly, although we attempted to control for the transitivity of the critical adjunct verb (drinking), we did not carefully consider the effect of transitivity of the previous verb (arrived/worked). This is perhaps an unavoidable confound, since argument structure is heavily correlated with lexical aspect (e.g., Levin 1993;Borer 2005). It is possible that there are differences in processing strategies deployed due to these syntactic differences in the previous clause that do not specifically reference the semantic/conceptual aspects of the clause. Strengthening our claims would require more carefully testing this hypothesis, or finding a new way to remove this confound.
Moreover, the design in this experiment was intended to maximally mirror the design of our self-paced reading experiments, in which we strove to have sufficient spill-over regions between the critical words. However, this introduced a number of issues. First of all, it increased the length between the filler phrase and the intended resolution site. Increased dependency length has been shown to diminish the sensitivity to plausible filler-gap dependencies, i.e., comprehenders show a later or smaller plausibility mismatch effect for longer dependencies than shorter dependencies (Wagers & Phillips 2014; Chow & Zhou 2019). Similarly, because our filler-gap dependencies were relative clauses that modified an object of a higher clause (John prepared the coffee that. . . drinking late this afternoon), this introduces an attachment ambiguity that may affect the EEG results. The non-finite adjunct clause may modify either the embedded clause (arrived/worked) or the main clause (prepared). Although we assume something like the Late Closure strategy should strongly bias participants to attach the adjunct clause to the embedded clause (Frazier & Fodor 1978), the availability of the alternate parse may contribute additional processing difficulty. This is critical to the issue of filler-gap dependency resolution, since no c-command relation holds between the filler NP and the adjunct clause if it is attached at the main clause level, which is required if comprehenders are to consider the adjunct clause as a potential resolution site for the filler-gap dependency. Ongoing follow-up work eliminates these confounds by using a much simpler design.
Finally, a word of caution about the statistical analyses of our results. Perhaps the most compelling result is the lack of effect of our factorial design on the N400 and the P600 components. However, as always, null results should be treated skeptically, particularly with low sample sizes as reported in the current experiment. Similarly, as Sassenhagen & Draschkow (2016) point out, the cluster-based permutation test is not a significance test of the spatial or temporal location of an observed effect. The cluster-formation phase of the test is only used to identify clusters for computing the test statistic. In other words, the timing and scalp distribution of the effect may be different in possible follow-up experiments. Thus, these findings should only be taken as a starting point for the study of how semantic information interferes with active dependency formation processes.