Prosodic disambiguation of wh-indeterminates in Mandarin Chinese

This study focuses on naturally occurring ambiguous utterances like “Zhōngguóduì shuí yě dǎ-bù-guò” in Mandarin to investigate whether and how prosody is used to disambiguate wh-indeterminates. The results of our production study suggest that wh-indeterminates are disambiguated prosodically. For the wh-region, interrogative readings are distinguished from indefinite readings by a longer duration and a higher maximum pitch. For the pre-wh region, longer duration was observed when the wh-word received an interrogative reading and when it was left-dislocated. For the post-wh region, significantly greater pitch excursion was observed for the indefinite reading than for the interrogative reading. In particular, the novel finding of post-wh pitch compression for wh-interrogatives in Mandarin is in line with what has been attested in other wh-in-situ languages, such as Japanese and Korean, which suggests shared prosodic mechanisms for disambiguating wh-indeterminates in wh-in-situ languages.

(1) Shuí yě méi qù shàngkè
    who also not go attend.class
    a. wh-indefinite: 'No one attended the class.'
    b. wh-interrogative: 'Who didn't attend the class as well?'

Previous studies on wh-indeterminates have reached divergent conclusions. Although some studies suggest that speakers rely more on syntactic binding for disambiguation (Shyu and Tung 2018), quite a few argue strongly that prosody is the primary mechanism for disambiguating wh-indeterminates (Hu 2002; Yang 2018; Hsu and Xu 2020; Wang and Wang 2020). Among those who argue for prosodic disambiguation, prominence of the wh-region (longer duration and higher pitch) is found to be a property of wh-interrogatives (Yang 2018; Wang and Wang 2020; Hsu and Xu 2020). However, existing studies diverge on how wh-indeterminates are prosodically disambiguated in the pre-wh and post-wh regions. Yang (2018) reports a shorter pre-wh region for wh-interrogatives and no duration differences in the post-wh region, based on stimuli like (2).
(3) (Wang and Wang 2020: 381) a.  2) and (3), stimuli used in previous studies have wh-indeterminates in sentence-initial subject position (Hu 2002; Wang and Wang 2020), sentence-final object position (Wang and Wang 2020), or direct object position (Yang 2018), or have wh-indeterminates co-occurring with unambiguous sentence-final particles (Hsu and Xu 2020). Such designs make it difficult to exclude the potential influence of sentential intonation patterns, such as boundary effects, and of morphological cues to the meaning of wh-indeterminates arising from the pseudo-minimal-pair stimuli. As a result, findings from previous studies are contradictory and inconclusive.
The present study focuses on naturally occurring utterances like (4) to study which mechanisms are at play.
(4) Zhōngguóduì shuí yě dǎ-bù-guò
    Chinese-team who also beat-not-Complement
    a. wh-indefinite: (i) 'The Chinese team can't beat anyone.' (ii) 'No one can beat the Chinese team.'
    b. wh-interrogative: (iii) 'Who is the team that the Chinese team also can't beat?' (iv) 'Who is the team that also can't beat the Chinese team?'

Sentences like (4) are often used by native speakers on social media to express (4a-i) when talking about the Chinese men's soccer team losing all its games and to express (4a-ii) when talking about the Chinese ping-pong team winning all its games. We investigated whether this kind of sentence can induce the interrogative readings in (4b-iii) and (4b-iv), and whether and how prosody is used for disambiguation.
2. Methodology and data. We created eight sentences like (4) and controlled the type of wh-words (regular wh-words versus D-linked wh-words) across the sentences. We also varied the length of the target sentences by varying the presence and length of the adverbs. The full list of the target sentences can be found in the Supplementary Materials. For each target sentence, we provided four possible readings like the ones in (4). Participants (15 native speakers of Mandarin, 8 female and 7 male, age range: 22-32) were recruited through email and social media and took part voluntarily. They were asked to record the target sentences on their own smartphones or laptops in a quiet room and to submit the audio recordings by email. For each possible reading of a target sentence, participants were asked either to say the target sentence aloud or to say "I do not think the target sentence can be used to express the given meaning" if they thought so. There was no limit on recording time or on the number of attempts. Participants could replace a recording at any time before submission by re-recording the sentence, and we included only the last version of each recording in the dataset. One participant's data were excluded because of incomplete recordings. Twelve participants used smartphones (Huawei, iPhone, or Samsung) for recording, and two used Lenovo laptops. All the devices had decent built-in microphones that produced audio files of good quality.
After receiving the audio files, we converted any that were in a different format into WAV using Praat. For each participant, 32 recorded audio files (8 target sentences × 4 possible readings) were included in the dataset. Each audio file was coded with a sentence label corresponding to the target sentence and the specified reading, or with "null" if the participant said, "I do not think the target sentence can be used to express the given meaning".
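The coding scheme described above can be illustrated with a minimal sketch. The sentence IDs (`S1` to `S8`), the reading tags, the label format, and the function name `code_recordings` are all hypothetical choices made for this illustration; the paper does not specify the actual label format.

```python
from itertools import product

# Hypothetical identifiers: the paper does not give its label format.
SENTENCES = [f"S{i}" for i in range(1, 9)]        # 8 target sentences
READINGS = ["a-i", "a-ii", "b-iii", "b-iv"]       # 4 readings, as in (4)


def code_recordings(rejected):
    """Label one participant's 32 recordings.

    `rejected` is a set of (sentence, reading) pairs for which the
    participant said the sentence cannot express the given meaning.
    Rejected pairs are coded "null"; all others get a sentence/reading label.
    """
    labels = {}
    for sentence, reading in product(SENTENCES, READINGS):
        if (sentence, reading) in rejected:
            labels[(sentence, reading)] = "null"
        else:
            labels[(sentence, reading)] = f"{sentence}_{reading}"
    return labels
```

Every participant contributes exactly 8 × 4 = 32 coded files under this scheme, and the "null" entries are what the acceptance counts in the results are based on.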

3. Results.
Four participants consistently rejected all interrogative readings of all target sentences, but the remaining 10 participants accepted the four-way ambiguity of the target sentences (as in (4i)-(4iv)) to some extent, with six of them accepting all four possible readings for all the target sentences. Table 1 shows the distribution of participants' acceptance of the interrogative-indefinite ambiguity in the target sentences. Although not all participants accepted the four-way ambiguity, our results empirically confirm that, for most participants, the interrogative-indefinite ambiguity does exist for structures like (4) with identical strings.

Participants | Acceptance of the four possible readings
no. 1, 9 and 14 | rejected all interrogative readings for all target sentences
no. 8 | rejected all interrogative readings and readings like (4a-ii) for all target sentences
no. 2, 4, 6, 9, 13, 15 | no rejections of any of the four readings for all target sentences
no. 3 | rejected reading of (4b-iv) for most target sentences, and reading of (4b-iii) for some target sentences
no. 7 | rejected reading of (4b-iii) for one target sentence where the wh-word is not D-linked and is preceded by an adverb
no. 10 | rejected both interrogative readings for some target sentences, especially when the wh-word is not D-linked
no. 11 | rejected interrogative readings and reading of (4b-iii) for some target sentences
no. 12 | rejected reading of (4b-iv) for most target sentences and reading of (4b-iii) and of (4a-ii) for some target sentences
no. 5 | N/A (data excluded because of incomplete recordings)

Table 1. Acceptable readings for participants
For the 10 participants who accepted interrogative readings, we compared the prosodic properties of the indefinite and interrogative readings. We first measured the lowest and highest pitch in the pre-wh region, the wh-region, and the post-wh region. We then standardized all pitch values as z-scores and computed the difference between the lowest and highest pitch, which allowed us to calculate the average pitch excursion. We also measured the duration of the pre-wh region, the wh-region, and the post-wh region in each audio file and calculated the average duration of each region. A linear mixed-effects regression model was used for the inferential statistics, testing whether the type of wh-word affects the prosodic properties of the indefinite and interrogative readings.
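The pitch-excursion measure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it assumes that z-scores are computed per speaker over all of that speaker's minimum and maximum pitch values (the paper does not say which pool was used), and the record keys (`region`, `reading`, `f0_min`, `f0_max`) and the function name `mean_excursions` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def mean_excursions(records):
    """Average z-scored pitch excursion per (region, reading) for one speaker.

    `records` is a list of dicts with hypothetical keys:
      region  - "pre-wh", "wh", or "post-wh"
      reading - "indefinite" or "interrogative"
      f0_min, f0_max - lowest and highest pitch in the region (Hz)
    Assumption: standardization pools all of the speaker's pitch values.
    """
    pool = [r[key] for r in records for key in ("f0_min", "f0_max")]
    mu, sd = mean(pool), pstdev(pool)
    if sd == 0:
        raise ValueError("pitch values are constant; z-scores are undefined")

    by_cell = defaultdict(list)
    for r in records:
        z_min = (r["f0_min"] - mu) / sd
        z_max = (r["f0_max"] - mu) / sd
        # Excursion = distance between highest and lowest standardized pitch.
        by_cell[(r["region"], r["reading"])].append(z_max - z_min)

    return {cell: mean(values) for cell, values in by_cell.items()}
```

Because both endpoints are standardized with the same mean and standard deviation, the excursion reduces to the raw pitch range divided by the speaker's standard deviation, which puts speakers with different pitch ranges on a comparable scale.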
Overall, the results suggest that wh-indeterminates are disambiguated prosodically. For the wh-region, interrogative readings are distinguished from indefinite readings by a longer duration (Figure 1) (p < .05) and a higher maximum pitch (Figure 2) (p < .001). The longer duration and higher maximum pitch on the wh-region reconfirm the prosodic prominence of wh-interrogatives previously reported in the literature (Yang 2018; Wang and Wang 2020; Hsu and Xu 2020). For the pre-wh region, longer duration was observed when the wh-word received an interrogative reading (p < .05) and when it was left-dislocated (p < .001) (Figure 3). This finding runs against the claim in Yang (2018) that a shorter pre-wh region is a reliable cue to the interrogative reading.

Figure 3. Duration of pre-wh region

For the post-wh region, significantly greater pitch excursion was observed for the indefinite reading than for the interrogative reading (Figure 4) (p < .05). This novel finding indicates that a compressed F0 range after the wh-word correlates with interrogative readings.

Figure 4. Duration of pre-wh region

4. Discussion. The experimental results demonstrate that Mandarin wh-indeterminates are ambiguous and that the interrogative reading is prosodically differentiated from the indefinite reading by wh-prominence, post-wh pitch compression, and longer pre-wh duration. The findings of wh-prominence and post-wh pitch compression for wh-interrogatives align with previous studies of wh-in-situ languages like Japanese and Korean (Jun 1993; Ishihara 2002), suggesting shared prosodic mechanisms for disambiguating wh-indeterminates. This raises interesting research questions regarding the typological account for the observed patterns and the underlying reasons behind the shared mechanisms across different languages.

However, our study has certain limitations that we aim to address in future research. Previous studies, including Hirotani (2005) and related work, have noted dialectal variations in Japanese that may impact wh-prosody, a factor that we did not account for in our experiments. Although inter-speaker variation was observed in the acceptance of interrogative-indefinite ambiguity in the target sentences (Table 1), an adequate explanation for this variation and its relationship with different prosodic strategies for disambiguating wh-indeterminates in Mandarin has yet to be found. Another limitation of our study is that the experimental design may have intentionally elicited different prosodic contours for disambiguating wh-indeterminates. As one reviewer noted, speakers tend to produce varying prosodic contours when they are made aware of different interpretations.

To address these limitations, future research will examine inter-speaker variation patterns and dialectal backgrounds more closely to investigate the influence of these factors on wh-prosody in Mandarin. Additionally, we will implement a block design for future production experiments and conduct perception tests to determine whether the prosodic strategies observed in production facilitate the processing of wh-indeterminates in Mandarin.

Appendix: Target sentences used in the experiment

Target sentences in simplified Chinese characters and pinyin with tones | English translation of wh-indefinite readings | English translation of wh-interrogative readings
中国队谁也打不过 zhōng guó duì shuí yě dǎ bù guò | "The Chinese team can't beat anyone." or "No one can beat the Chinese team" | "Which team is the team that the Chinese team can't beat?" or "Which team is the team that also can't beat the Chinese team?"
中国队哪个队也打不过 zhōng guó duì nǎ gè duì yě dǎ bù guò | "The Chinese team can't beat anyone." or "No one can beat the Chinese team" | "Which team is the team that the Chinese team can't beat?" or "Which team is the team that also can't beat the Chinese team?"
今年世界杯法国队谁也赢不了 jīn nián shì jiè bēi fǎ guó duì shuí yě yíng bù liǎo | "The French team can't beat any team for this year's World Cup." or "No team can beat the French team for this year's World Cup" | "Which team is the team that the French team can't beat for this year's World Cup?" or "Which team is the team that also can't beat the French team for this year's World Cup?"
今年世界杯法国队哪个队也赢不了 jīn nián shì jiè bēi fǎ guó duì nǎ gè duì yě yíng bù liǎo | "The French team can't beat any team for this year's World Cup." or "No team can beat the French team for this year's World Cup" | "Which team is the team that the French team can't beat for this year's World Cup?" or "Which team is the team that also can't beat the French team for this year's World Cup?"
张路最近谁也看不惯 zhāng lù zuì jìn shuí yě kàn bù guàn |  | "Who is the person that Zhang Lu also cannot tolerate?" or "Who is the person that also cannot tolerate Zhang Lu?"
赵新昨天上午谁也联系不上 zhào xīn zuó tiān shàng wǔ shuí yě lián xì bù shàng | "Zhao Xin isn't able to reach anyone" or "No one is able to reach Zhao Xin" | "Who is the person that Zhao Xin isn't able to reach either?" or "Who is the person that is not able to reach Zhao Xin either?"
赵新昨天上午哪个同事也联系不上 zhào xīn zuó tiān shàng wǔ nǎ gè tóng shì yě lián xì bù shàng | "Zhao Xin isn't able to reach any colleague" or "No colleague is able to reach Zhao Xin" | "Who is the person that Zhao Xin isn't able to reach either?" or "Who is the person that is not able to reach Zhao Xin either?"