Bootstrapping where? Location changes disrupt intransitive verb mapping

Children utilize a range of cues in verb learning. The current studies explore children’s weighting of two different cues to verb meaning: number of syntactic arguments and event location. Naigles (1990) demonstrated that children use syntactic bootstrapping in mapping transitive and intransitive verbs: they hypothesize that intransitive verbs refer to one-participant events and transitive verbs refer to two-participant events. Previous work also indicates that children are sensitive to the location in which an event takes place when they are learning new verbs. The current study builds on this work, exploring the role of location changes in mapping new transitive and intransitive verbs. We find that children robustly link transitive verbs to two-person events but are weaker overall at linking intransitive verbs to one-person events. Children are also less likely to link intransitive verbs to one-person events when the event location has changed, suggesting that they are influenced by background changes when interpreting intransitive verbs.


1. Introduction.
Children are tasked with synthesizing various meaning cues when mapping new words. Verbs are particularly challenging, as they refer to events that extend across time and require nouns as arguments. Children have been shown to utilize a range of cues in verb learning. The current studies explore children's weighting of two different cues to verb meaning: number of syntactic arguments and event location. Naigles (1990) demonstrated that children use syntactic bootstrapping in mapping transitive and intransitive verbs: they hypothesize that intransitive verbs refer to one-participant events and transitive verbs refer to two-participant events. Previous work also indicates that children are sensitive to the location in which an event takes place when they are learning new verbs. When interpreting novel motion verbs, children are influenced by a background change (Smyder & Harrigan 2021) and are most likely to be influenced by more visually salient background changes (Benjamin & Harrigan 2023; Harrigan et al. submitted). The current study builds on this work, exploring the role of location changes in mapping new transitive and intransitive verbs. In a series of two forced-choice tasks, we find that 4-7 year-old children can override a location change when mapping transitive verbs, but are influenced by background changes when interpreting intransitive verbs.

2. Background.
This work investigates two cues children utilize in mapping new verbs: the number of syntactic arguments a verb takes, and the location of the event to which the verb refers. Previous experimental work investigating each of these cues lays the groundwork for our studies.
2.1. SYNTACTIC BOOTSTRAPPING. Verbs present a particularly challenging learning problem for children acquiring language. The events to which they refer may be conceptually complex, extending across time and space, and potentially involving multiple participants. Linguistically, they also require the child to keep track of multiple components: the verb, which references the event, and the noun(s) that serve as the verb's argument(s), which usually reference the event's participant(s). This linguistic complexity can serve as both a challenge and a support in verb learning. While more is required of the child, the child can also potentially make use of the linguistic information present. A range of studies illustrate the child's ability to utilize syntactic structure and distribution for mapping verb meanings.
In a classic study by Naigles (1990), 24-month-old infants are exposed to two participants, a duck and a bunny, performing two simultaneous events: a one-person event and a two-person event. For example, both participants are circling their arms, and at the same time the duck is bopping the bunny on the head and causing her to squat up and down. When children hear this set of events named with a novel verb in a transitive structure, like (1), they are more likely to interpret the novel verb as referring to the two-person event (the bopping). When they hear the set of events named with a novel verb in an intransitive structure, like (2), they are more likely to interpret the novel verb as referring to the one-person event (the arm circling).
(1) The duck is blicking the bunny.
(2) The duck and the bunny are blicking.
The ability to utilize syntactic structure has been referred to as syntactic bootstrapping: children are able to "bootstrap" themselves into meaning hypotheses from the structure. Studies have illustrated children's ability to utilize structure across various age groups and methodologies, and even the ability to utilize syntactic distribution over time (Pinker 1989, Lidz et al. 2003, Gleitman et al. 2005, Scott & Fisher 2009, Yuan & Fisher 2009, Fisher et al. 2010). And, while most of the work on syntactic bootstrapping focuses on the relatively straightforward connection between number of arguments and number of event participants, studies have also demonstrated that children can use syntactic properties to hypothesize more complex meaning components for more abstract verbs, such as attitude verbs, which refer to mental states (Asplin 2002, Papafragou et al. 2007, Harrigan et al. 2019). The current work builds on the robust finding that children are sensitive to the number of syntactic arguments a verb has when hypothesizing the number of participants that the verb's event requires.

2.2. EVENT LOCATION AS A CUE TO VERB MEANING.
In addition to the linguistic cues, many other cues are present during verb mapping, including a range of language-external factors. The events that verbs denote potentially take place in specific locations, which may be important to verb meaning. For example, swimming and diving require their events to take place in the water, while hiking and caving must take place outside. In some cases, words refer to objects or events which are likely to be found in canonical locations; blenders belong in the kitchen, sleeping usually occurs in a bedroom. For some verbs, however, the relationship between event location and verb meaning goes beyond this: swimming is simply not swimming if it doesn't occur in the water. Children may be aware of the potential link between event location and verb meaning. Several previous studies indicate that children are sensitive to event location when mapping new motion verbs, and that they even consider the salience of the location when deciding whether to encode it as relevant.
A verb learning task reported by Smyder & Harrigan (2021) tests the role of agent identity and event location in mapping novel motion verbs. In a forced-choice task, they pit motion types (manner, path) against the language-external factors of agent identity and event location. They find that 4-7 year-old children are differentially sensitive to these factors in mapping verbs: they are more likely to be lured by the location of an event than by the identity of the agent performing the action. They hypothesize that children may be aware of the potential relevance of event location to verb meaning. Another study by Benjamin & Harrigan (2023) expands on this work. In another forced-choice task, 4-7 year-olds are presented with novel motion verbs, this time pitting motion type (manner, path) against a location change. In this task, however, they manipulate the salience of the event's location. They find that children are more likely to be lured away from mapping manner when the event has been introduced in a more salient location, such as suspended in air or underwater, than when the event has been introduced in a neutral location, such as a living room or a yard. They conclude that not only are children aware that event location may be relevant to a verb's meaning, but they are differentially sensitive to location based on the saliency of that location.
The current study extends these findings, further investigating children's sensitivity to event location. In this study, we pit event location against number of syntactic arguments. We find that children link transitive structures to two-person events more strongly than they link intransitive structures to one-person events, and that they are more likely to be disrupted by location changes when mapping intransitive structures. Our findings also show sensitivity to location changes, although they diverge somewhat from those of Benjamin & Harrigan (2023): we find that children are not differentially sensitive to location changes based on the location's saliency. Our findings extend both the syntactic bootstrapping and the location cue literature, contributing to a clearer picture of how children integrate conflicting cues in verb learning.

3. Study 1a.
Our main study investigates how children weigh the cue of number of syntactic arguments against an event's location when mapping verb meanings. This study was modeled after the design of Naigles (1990). We investigate older children's ability to use syntactic bootstrapping to learn transitive and intransitive verbs in a forced-choice task, but we also introduce event location as a possible lure for verb meaning.
3.1. PARTICIPANTS. Participants were 15 monolingual English-learning preschool and elementary school-aged children aged 4;2-7;10 (mean = 6;1) recruited from the Williamsburg, VA area. Children were recruited via the William & Mary Child Language Lab database, online networking, and through local preschools. Fourteen children were tested remotely over Zoom due to the COVID-19 pandemic, and one child was tested in person at a local preschool. Participants were run only if they gave verbal assent and the researchers had received a completed consent form from the parent or legal guardian.
3.2. PROCEDURE. Each participant is shown the same series of short videos, consisting of 12 trials. Two pseudo-randomized orders of trials are randomly assigned to participants. For participants completing the task remotely, videos are displayed via screen sharing over Zoom; for participants completing the task in person, videos are shown on a laptop situated in front of the child. All children indicate their response by pointing. Before the first trial begins, we ensure that the camera is situated to allow researchers to code pointing responses. All remote participants are run in a relatively quiet space in their home, usually with a parent close by, and in-person participants are run in a relatively quiet room in their preschool. All participants are given the option to stop the activity at any time.

3.3. DESIGN & MATERIALS.
The study utilizes a 2x3 design. We manipulate within subjects VERB TYPE (transitive v intransitive) and BACKGROUND SALIENCY (neutral v unknown v salient). The video stimuli were edited using FinalCutPro. Actors were filmed performing actions in front of a green screen so that the background of the video could be removed. Each trial was assigned to one of six backgrounds, two in each BACKGROUND SALIENCY category: neutral, or low saliency (inside, outside); unknown, or medium saliency (desert, snow); and salient, or high saliency (underwater, sky). Backgrounds were chosen for these categories based on the backgrounds reported by Benjamin & Harrigan (2023), which were based on norming studies with adults. See Table 1 for background images by saliency condition. The same two actors were used throughout all trials, but the agent and patient roles in two-person events were counterbalanced. During familiarization, participants saw two characters performing two simultaneous actions on one of the six backgrounds: a one-person event and a two-person event. For example, one character is bopping the other on the head while both characters are simultaneously circling their arms (Figure 1).
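The crossing of factors described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' stimulus-assignment script: the dictionary layout and the function name `build_trials` are hypothetical, and the pairing of saliency labels with backgrounds assumes that the three levels correspond, in order, to the three background pairs listed in the text. It also assumes that each child sees each verb type once on each of the six backgrounds, which yields the 12 trials mentioned in the procedure.

```python
from itertools import product

# Factor levels taken from the design description; the pairing of saliency
# labels with background pairs is an assumption made for illustration.
VERB_TYPES = ("transitive", "intransitive")
BACKGROUNDS = {
    "neutral": ("inside", "outside"),
    "unknown": ("desert", "snow"),
    "salient": ("underwater", "sky"),
}


def build_trials():
    """Cross VERB TYPE with every background, tagging each trial with its saliency level."""
    trials = []
    for verb_type, (saliency, backgrounds) in product(VERB_TYPES, BACKGROUNDS.items()):
        for background in backgrounds:
            trials.append(
                {"verb_type": verb_type, "saliency": saliency, "background": background}
            )
    return trials


trials = build_trials()
print(len(trials))  # prints 12: 2 verb types x 3 saliency levels x 2 backgrounds
```

Under these assumptions, every VERB TYPE by BACKGROUND SALIENCY cell contains exactly two trials, one per background image.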

Figure 1. Sample familiarization event
The event was named with one of two VERB TYPE conditions: a transitive structure (3) or an intransitive structure (4).
During the test phase, children saw two videos played simultaneously: one showed the one-person event and the other showed the two-person event.
Children were asked to point to the video that shows the novel verb (5).
The distractor event (the event that does not match the familiarization phase syntax) takes place in the same location as the familiarization. This forces children to choose whether the best matching event is the one that matches the number of arguments in the syntax (VERB TYPE response: Figure 2, left), or the one that matches the background in which the original event took place (BACKGROUND SALIENCY response: Figure 2, right).
The results suggest that children are still able to successfully map transitive structures to two-person events, even in the face of a changing event location, but that they are unable to map intransitive structures to one-person events. Breaking the results out by BACKGROUND SALIENCY, we find that the level of background saliency does not impact the overall accuracy of the transitive and intransitive conditions: transitive items have a high level of accuracy across all levels of BACKGROUND SALIENCY, and intransitive items have a low level of accuracy (Table 4, Figure 4).
3.5. STATISTICAL ANALYSIS. Statistical analyses support the findings. The results were analyzed using a generalized linear mixed effects model, which is appropriate for analyzing categorical data (Baayen 2007; Jaeger 2008). The reported models have random intercepts. These models predict the probability of a specific response (a correct answer) across different conditions (see Agresti 2002, Jaeger 2008). We ran a mixed-effect logit model with correct response as the dependent measure, with VERB TYPE (transitive, intransitive) and BACKGROUND SALIENCY (neutral, unknown, salient) as fixed effects, and SUBJECT as a random effect. We find a main effect of VERB TYPE [χ²(1)=14.72, p=0.001], indicating that children's accuracy is significantly different across the two VERB TYPE conditions: they are more adult-like in mapping transitive verbs, but are distracted by location changes in mapping intransitive verbs. We find no main effect of BACKGROUND SALIENCY [χ²(2)=0.03, p=0.99], and no interaction between VERB TYPE and BACKGROUND SALIENCY [χ²(2)=0.47, p=0.79], indicating that this effect does not differ based on the saliency of the background.
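For readers less familiar with this class of model, one way to write out a mixed-effect logit model of the kind described is given below. This is a sketch under assumptions, not the authors' exact specification: the symbols, the dummy coding of the predictors, and the inclusion of the interaction term in a single equation are illustrative choices. Here $y_{ij} = 1$ if child $j$ gives a correct response on trial $i$, $V_{ij}$ is an indicator for VERB TYPE, $\mathbf{S}_{ij}$ is a two-element dummy vector for the three BACKGROUND SALIENCY levels, and $u_j$ is the by-subject random intercept.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0
  + \beta_1 V_{ij}
  + \boldsymbol{\beta}_2^{\top} \mathbf{S}_{ij}
  + \boldsymbol{\beta}_3^{\top} \left( V_{ij}\, \mathbf{S}_{ij} \right)
  + u_j,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
```

On this formulation, the VERB TYPE test has one degree of freedom ($\beta_1$), while the BACKGROUND SALIENCY and interaction tests each have two ($\boldsymbol{\beta}_2$ and $\boldsymbol{\beta}_3$), matching the degrees of freedom of the reported χ² statistics.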
3.6. STUDY 1A DISCUSSION. We find that older children are disrupted by background changes in mapping intransitive structures to one-person events regardless of the saliency of the event background, but are able to link transitive structures with two-person events even in the face of location changes. There are several possible explanations for this asymmetry in the influence of background. It is possible that the background change disrupts children in mapping intransitive but not transitive structures due to a more fragile link between intransitive structures and one-person events. It is also possible, however, that there is an overall bias for choosing the two-person events in the context of our study, which would manifest even without the location change manipulation. This could be driven by methodological factors, such as imbalanced items, or by pragmatic factors. A follow-up study aims to clarify this question by using the same items, methodology, and subject population as the current study, but without the BACKGROUND SALIENCY manipulation.

4. Study 1b.
As a follow-up to study 1a, we ran a study modeled directly after Naigles (1990), investigating older children's ability to use syntactic bootstrapping to learn transitive and intransitive verbs in a forced-choice task. To better understand the role of the location change in the findings of study 1a, we ran a version of the task that used the same items and methodology, but did not include the BACKGROUND SALIENCY manipulation.
4.1. PARTICIPANTS. Participants were 16 monolingual English-learning preschool and elementary school-aged children aged 4;1-7;5 (mean = 5;6) recruited from the Williamsburg, VA area. Children were recruited via the William & Mary Child Language Lab database, online networking, and through local preschools. Twelve children were tested remotely over Zoom due to the COVID-19 pandemic, and four were tested at a local preschool. Participants were run only if they gave verbal assent and the researchers had received a completed consent form from the parent or legal guardian.
4.2. PROCEDURE. The procedure is identical to that of study 1a, described in §3.2.

4.3. DESIGN & MATERIALS.
The design and materials are identical to study 1a, described in §3.3, except that the BACKGROUND SALIENCY factor was eliminated. VERB TYPE (transitive v intransitive) was manipulated within subjects. Because the stimuli from study 1a were filmed in front of a green screen, we could easily use the same videos in study 1b, but edit them to be on a neutral background throughout all familiarization and test items. As in study 1a, in the familiarization, participants saw two characters performing two simultaneous actions: a one-person event and a two-person event. For example, one character is bopping the other on the head while both characters are simultaneously circling their arms (Figure 5).

Figure 5. Sample familiarization event
The event was named with a transitive structure (6) or an intransitive structure (7).
During the test phase, the child sees two simultaneous videos on the screen and is prompted to choose which one is the best example of the verb used to name the events during familiarization (8).
The two test videos tease apart the two simultaneous events presented in the familiarization phase: one depicts the one-person event (Figure 6, left), and the other depicts the two-person event (Figure 6, right).
Figure 6. Sample test events.
Children saw six transitive and six intransitive trials, for a total of 12 trials per child (see Table 5).

VERB TYPE       # of items
transitive      6
intransitive    6
Table 5. Study 1b design
4.4. PREDICTIONS. We predict that, like the infants in Naigles' 1990 study, children will be more likely to choose the two-person event when they hear the familiarization named with the transitive structure (6), and that they will be more likely to choose the one-person event when the familiarization is named with the intransitive structure (7). Critically, we are interested in determining whether the two-person bias observed in study 1a persists, or whether that response pattern was driven by the location change lure, and therefore can be attributed to an asymmetry between the influence of location changes on transitive vs. intransitive mappings.
4.5. RESULTS. Children's responses were coded by the experimenter as the child delivered them. We find that children are more accurate at identifying the two-person event in the transitive condition than they are at identifying the one-person event in the intransitive condition (Table 6, Figure 7). Even without the lure of background change, we still see a bias toward choosing the two-person event, although the effect here is weaker (Table 7, Figure 8).
The results were again analyzed using a generalized linear mixed effects model. We ran a mixed-effect logit model with correct response as the dependent measure, with VERB TYPE (transitive, intransitive) as a fixed effect, and SUBJECT as a random effect. We find a main effect of VERB TYPE [χ²(1)=28.62, p<0.001], indicating that children's accuracy is significantly different across the two VERB TYPE conditions: they are more adult-like in mapping transitive verbs than intransitive verbs.
We also use a statistical analysis to compare the results of studies 1a-b. We ran a mixed-effect logit model with correct response as the dependent measure, with VERB TYPE (transitive, intransitive) and STUDY (study 1a: background change, study 1b: no background change) as fixed effects, and SUBJECT as a random effect. We find a main effect of VERB TYPE [χ²(1)=2.79, p<0.001], supporting the conclusion that across the two studies children are more adult-like in mapping transitive verbs than intransitive verbs. We also find a main effect of STUDY [χ²(1)=4.31, p=0.04], indicating that children are more accurate overall in study 1b, where there is no background change. Pairwise comparisons reveal that this main effect is driven by a difference in accuracy in the intransitive conditions (Table 8): children are more accurate at mapping intransitive structures to the one-person event when there is not a background change.
Table 8. Probability accuracy and pairwise comparisons for studies 1a-b

5. Discussion.
Across two forced-choice tasks based on Naigles (1990), we find that 4-7 year-old children strongly link transitive structures to two-person events, but have a much weaker link between intransitive structures and one-person events. We do, however, find a significantly higher likelihood of mapping intransitive structures to one-person events in study 1b, where there is no background lure. This suggests that although the background lure is not the only source of the discrepancy in accuracy between transitive and intransitive mappings, it did contribute to the asymmetry in study 1a. We do not see the same effect of background saliency reported by Benjamin & Harrigan (2023) for children's mapping of novel motion verbs. When background competes with syntax, its impact is equivalent across various levels of saliency: background changes are consistently overridden when mapping transitive structures, and consistently disruptive in mapping intransitive structures. This work serves to expand our understanding of the role of various cues in children's verb mapping. It indicates that the connection between intransitive structures and one-person events is more fragile than the relationship between transitive structures and two-person events, and supports the hypothesis that background changes are considered by the learner as potentially relevant to verb meaning (Smyder & Harrigan 2021, Benjamin & Harrigan 2023, Harrigan et al. submitted).
5.1. REMAINING QUESTIONS. While this work contributes to our understanding of children's use of syntactic structure and event location for mapping verb meanings, several questions remain unanswered. The first concerns the discrepancy between children's interpretation of transitive and intransitive structures. Children robustly connect verbs with two arguments to two-person events, with or without a change of location. For intransitive structures, however, the connection between arguments and participants is weaker: they are much less accurate even without a location change, and more likely to be disrupted further when there is a background change. There are several possible explanations for these findings. The bias toward choosing the two-person event may be driven by a kind of pragmatic reasoning. The intransitive sentences that we used had two agents in the subject position, joined by conjunction (9). This may have lured children to choose the image where the two participants interacted.
Additionally, at times an intransitive structure can refer to an event in which two participants interact. For example, either sentence (10) or (11) could easily refer to the image in Figure 9: Kate is the agent and Monica the patient of the hugging, although we could also describe this as Kate and Monica hugging.
Figure 9. Hugging event
The second remaining question concerns the background change manipulation. Benjamin & Harrigan (2023) find that children are more likely to be lured by events taking place in more salient locations, such as underwater or suspended in midair. Although we use the same backgrounds utilized in that study, we find no differential impact for backgrounds of varying levels of saliency. Why should children be differentially impacted by background salience for motion verbs but not for transitive/intransitive verbs? Are motion verbs more likely to encode location? Further investigation of verb inventory in English and cross-linguistically is needed.

6. Conclusion.
The current studies investigate cues to verb meaning: we test the ability of 4-7 year-old English-learning children to map novel verbs in the face of location changes. In this study, we teach children novel verbs, pitting number of syntactic arguments against the location of the event, and manipulating the saliency of the event location. We find that children prefer two-person events overall: they overwhelmingly choose the two-person event when they have heard a transitive structure, but also choose the two-person event at a high rate even when they have heard an intransitive structure. We do find an impact of location change, however: the likelihood of mapping an intransitive structure to a two-person event is higher when the event location has changed. This supports work from several other studies (Smyder & Harrigan 2021, Benjamin & Harrigan 2023, Harrigan et al. submitted) suggesting that children encode event location as potentially relevant to verb meaning. Unlike Benjamin & Harrigan (2023), however, the current study does not find saliency of background to have an impact: children are equally impacted (or not) by any location change. The current work leaves open several questions about how the role of background differs in mapping transitive/intransitive verbs vs. motion verbs. These studies contribute to our understanding of how children weigh language cues, like the number of syntactic arguments a verb takes, against language-external cues like event location.

Figure 2. Sample test events.
Participants had two transitive and two intransitive trials of each BACKGROUND SALIENCY level, one for each of the background images. The design is illustrated in Table 2 below.

Figure 4. Proportion accurate responses for VERB TYPE conditions by BACKGROUND SALIENCY.
Figure 8. Proportion accurate responses for VERB TYPE conditions in studies 1a-b.
4.6. STATISTICAL ANALYSIS. Statistical analyses clarify the findings.

Table 1. Backgrounds by saliency category.

Table 3. Proportion accurate responses for VERB TYPE conditions.
Figure 3. Proportion accurate responses for VERB TYPE conditions.
3.4. RESULTS. Children's responses were coded by the experimenter as they were delivered by the child. We find that children are more accurate at identifying the two-person event in the transitive condition than they are at identifying the one-person event in the intransitive condition (Table 3, Figure 3).

Table 4. Proportion accurate responses for VERB TYPE conditions by BACKGROUND SALIENCY.

Table 6. Proportion accurate responses for VERB TYPE conditions.
Figure 7. Proportion accurate responses for VERB TYPE conditions.