Glides are high vowels in onsets and codas: A case of Southern Vietnamese

Glides, /j/ and /w/, are recognized as semivowels in Vietnamese, and they are conventionally written as /i̯/ and /u̯/, respectively (Phạm & McLeod 2016, Tran, Vallée & Granjon 2019). This transcription implies that the two semivowels are non-syllabic /i/ and /u/, suggesting that /i̯/ and /u̯/ might have acoustic properties similar or nearly identical to those of the two high vowels /i/ and /u/. As the semivowels are only allowed in onsets and codas, and the vowels are allowed in nuclei (Kirby 2011, Phạm & McLeod 2016, Brunelle 2017, Tran et al. 2019), the only difference between the two semivowels and the two vowels might be where they can appear. That is, the semivowels might be non-syllabic positional variants of the high vowels that appear in onsets and codas (Levi 2008). To examine this hypothesis, audio files in which a female native speaker of Southern Vietnamese recites the 200 Swadesh words in her dialect were collected, and the first formant (F1) and the second formant (F2) of each word that contains either /i/, /u/, /i̯/ or /u̯/ were measured using Praat (Boersma & Weenink 2020). Analyses of the F1 showed no significant difference between /i/ and /i̯/ (p = .97) or between /u/ and /u̯/ (p = .78). Similarly, there was no significant difference in the F2 between /i/ and /i̯/ (p = .91) or between /u/ and /u̯/ (p = .91). The results support the view that /i/ and /i̯/, as well as /u/ and /u̯/, have nearly identical phonetic properties.


Introduction.
Vietnamese is a Mon-Khmer language in the Austroasiatic family that is classified as a tonal language (Phạm & McLeod 2016, Tran et al. 2019). There are four main regional varieties of Vietnamese: Standard, Northern, Central, and Southern (Phạm & McLeod 2016). A Vietnamese syllable requires an onset, a nucleus, and a tone, and the coda is optional (Brunelle 2017, Tran et al. 2019). Vietnamese has two glides, /j/ and /w/, and they are recognized as the semivowels /i̯/ and /u̯/, respectively (Phạm & McLeod 2016, Tran et al. 2019). As implied by the fact that the glides are considered semivowels, /i̯/ and /u̯/ might have phonetic properties identical or very similar to those of the vowels /i/ and /u/, respectively. Thus, this paper addresses the question of whether the two semivowels and the two vowels are acoustically similar by analyzing data collected from a female native speaker of Southern Vietnamese. In other words, this paper explores the perspective that semivowels are actually vowels that appear in onsets and codas.

Literature review.
2.1. SYLLABLES. The structure of a Vietnamese syllable is (C1)(w)V(C2/w)T, in which (C1) represents the initial singleton consonant, (w) represents the medial semivowel, V represents the main vowel, (C2/w) represents the final consonant or semivowel, and T represents the tone (Phạm & McLeod 2016). In this paper, the terms onset, nucleus, and coda are used: the initial singleton consonant and the medial semivowel are treated as the onset, the main vowel as the nucleus, and the final consonant or final semivowel as the coda. Of the five components of a Vietnamese syllable, only the tone and the nucleus are obligatory, and the other components (i.e., onset and coda) are optional (Phạm & McLeod 2016). For instance, the word ở /ə̌/ 'at' has the smallest possible syllable structure, as it consists only of the nucleus and the tone. Although some researchers take the same perspective as Phạm and McLeod (2016), Ðoàn (as cited in Phạm & McLeod 2016) suggests that every syllable has all five components because the glottal stop /ʔ/ appears when the onset is absent, and the absence of the medial and final consonants or semivowels is considered to be "zero phonemes". This paper adopts Ðoàn's perspective that every Vietnamese word has an onset (e.g., the word ở 'at' is considered to be pronounced as /ʔə̌/). However, this paper does not accept Ðoàn's claim that the absence of a coda constitutes "zero phonemes" because: (i) it seems unreasonable that only the coda allows "zero phonemes" and the initial consonant does not, and (ii) it is more reasonable to hypothesize that the onset, nucleus, and tone are obligatory and the coda is optional in Vietnamese, which is consistent with the claim that the (w) and the coda are optional (Brunelle 2017, Tran et al. 2019).
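The syllable template adopted above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not part of the study's analysis: the class name `Syllable`, its field names, and the placeholder tone label "T" are hypothetical, and segments are represented as plain IPA strings. The sketch encodes the view taken in this paper that the nucleus and tone are obligatory, that a glottal stop fills an otherwise empty initial slot, and that the medial semivowel and the coda are optional.

```python
from dataclasses import dataclass
from typing import Optional

GLOTTAL_STOP = "\u0294"  # IPA glottal stop (ʔ)


@dataclass(frozen=True)
class Syllable:
    """Hypothetical container for the template (C1)(w)V(C2/w)T."""
    nucleus: str                    # main vowel (V), obligatory
    tone: str                       # tone label (T), obligatory
    initial: Optional[str] = None   # initial consonant (C1)
    medial: Optional[str] = None    # medial semivowel (w)
    final: Optional[str] = None     # final consonant or semivowel (C2/w)

    def __post_init__(self) -> None:
        # Nucleus and tone may not be empty in any analysis discussed above.
        if not self.nucleus or not self.tone:
            raise ValueError("nucleus and tone are obligatory")

    @property
    def onset(self) -> str:
        # Assumption adopted in this paper: a glottal stop appears
        # whenever no initial consonant is present.
        return (self.initial or GLOTTAL_STOP) + (self.medial or "")

    @property
    def coda(self) -> str:
        # The coda is optional, so an absent coda is simply empty.
        return self.final or ""

    def segments(self) -> str:
        return self.onset + self.nucleus + self.coda


# The word for 'at' has only a nucleus and a tone underlyingly.
bare = Syllable(nucleus="\u0259", tone="T")  # "\u0259" is schwa (ə)
print(bare.segments())  # prints ʔə
```

Treating the glottal stop as a default inside `onset`, rather than storing it in `initial`, keeps the distinction between what is specified for a word and what is supplied by the syllable template.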
2.2. VOWELS. A nucleus is obligatory in a syllable in all human spoken languages (e.g., Hayes 2009, Carlisle 2001), and this is also true in Vietnamese (Phạm & McLeod 2016). In Southern Vietnamese, there are a total of 15 vowels, including nine long singleton vowels /i/, /e/, /ɤ/, /ɛ/, /ɯ/, /u/, /o/, /ɔ/ and /a/, three short singleton vowels /ɤ̆/, /ă/ and /ɔ̆/, and three diphthongs /ie/, /uo/ and /ɯɤ/ (Phạm & McLeod 2016). In this paper, the diphthongs /ie/ and /uo/ are treated as /i̯e/ and /u̯o/ because: (i) there is a possibility that these two diphthongs (i.e., VV) are in fact combinations of a semivowel and a vowel (i.e., wV); (ii) no combination like wVV (e.g., /i̯ie/ and /u̯ie/) was found in the data; and (iii) all the diphthongs (except for /ɯɤ/) start with either /i/ or /u/, which are thought to have phonetic properties similar to those of the semivowels. Although these points might also hold for the diphthong /ɯɤ/, which is the unrounded version of /uo/, I did not treat it as wV because previous research did not treat /ɰ/ as a semivowel.
2.3. CONSONANTS. In Southern Vietnamese, consonants are categorized into two types based on where in a syllable they appear: onset consonants and coda consonants (e.g., Kirby 2011, Phạm & McLeod 2016). There are 21 pulmonic consonants and two non-pulmonic voiced implosives, a total of 23 consonants that are allowed in onsets in Southern Vietnamese, although the reported number of consonants available in onsets ranges from 19 (Kirby 2011) to 23 (Phạm & McLeod 2016). This paper adopts Phạm and McLeod's (2016) inventory except that: (i) /p/ is excluded, as some researchers do not accept /p/, /ʔ/ and /r/, and Southern Vietnamese uses the voiced plosive instead of the voiceless one (Phạm & McLeod 2016); (ii) the pulmonic voiced bilabial and alveolar plosives (i.e., /b/ and /d/) are treated as non-pulmonic voiced bilabial and alveolar implosives (i.e., /ɓ/ and /ɗ/) because the voiced plosives are canonically realized as implosives (Kirby 2011, 2018), and the speaker used the implosives rather than the plosives; and (iii) /ɾ/ is included in accordance with how the participant speaks (see Table 1).
2.4. SEMIVOWELS. In Vietnamese, glides are recognized as semivowels, and they appear in either the medial position of a syllable (i.e., onset) or the syllable-final position (i.e., coda; Phạm & McLeod 2016). Ðoàn (as cited in Phạm & McLeod 2016) states that the transcription conventions for the two semivowels employ either /w/ and /j/ or /u̯/ and /i̯/. The latter pair is used throughout this paper to be consistent with the hypothesis that the semivowels and the two high vowels /u/ and /i/ are acoustically the same, differing only in that the semivowels are non-syllabic whereas the vowels are syllabic. As suggested by the transcription convention, semivowels, or what Levi (2008) terms "derived glides", might be non-syllabic positional variants of high vowels that appear in onsets and codas. Thus, this paper uses acoustic analysis to examine the perspective that the semivowels /u̯/ and /i̯/ might actually be /u/ and /i/ appearing in onsets and codas.
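The positional-variant hypothesis can be stated as a simple mapping from syllable position to surface form. The following Python sketch is an illustration only; the function name `surface_form` and the position labels are hypothetical, and the non-syllabic diacritic is the Unicode combining inverted breve below (U+032F) used in the transcriptions /i̯/ and /u̯/.

```python
NON_SYLLABIC = "\u032f"   # combining inverted breve below, as in i̯ and u̯
HIGH_VOWELS = {"i", "u"}
POSITIONS = {"onset", "nucleus", "coda"}


def surface_form(vowel: str, position: str) -> str:
    """Return the hypothesized realization of a high vowel in a syllable position.

    Under the hypothesis examined here, /i/ and /u/ are syllabic in the
    nucleus and non-syllabic (i.e., semivowels) in onsets and codas.
    """
    if vowel not in HIGH_VOWELS:
        raise ValueError("only the high vowels /i/ and /u/ are covered")
    if position not in POSITIONS:
        raise ValueError("position must be 'onset', 'nucleus', or 'coda'")
    if position == "nucleus":
        return vowel
    return vowel + NON_SYLLABIC


for pos in ("onset", "nucleus", "coda"):
    print(pos, surface_form("i", pos), surface_form("u", pos))
```

Because the output depends only on position, the syllabic and non-syllabic forms never occur in the same environment, which is the complementary distribution the hypothesis relies on.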

SPEAKER AND RECORDING.
A thirty-one-year-old female native speaker of Southern Vietnamese who was born and raised in Ho Chi Minh City, in the southern part of Vietnam, was recruited from California State University, Fresno. She had lived in the United States for nine months and was a graduate student at the university at the time of the elicitation. She was asked to record herself reciting the 200 Swadesh words in her dialect. She did not receive any compensation for participating in the study.

DATA ANALYSIS.
To examine whether the quality of the Vietnamese semivowels /i̯/ and /u̯/ differs from that of the two high vowels /i/ and /u/, the first formant (F1), the second formant (F2), and the duration were measured using Praat (Boersma & Weenink 2020) for the 82 words from the 200 Swadesh word list that contain either the two semivowels or the two high vowels.
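Before modeling, measurements of this kind are typically arranged as one row per target segment and summarized by phoneme. The Python sketch below shows one possible layout under stated assumptions: the row format, the function name `summarize`, and all numeric values are hypothetical placeholders chosen for illustration, not data from this study.

```python
from collections import defaultdict
from statistics import mean

I_SEMI = "i\u032f"  # non-syllabic i (i̯)
U_SEMI = "u\u032f"  # non-syllabic u (u̯)

# Hypothetical rows: (phoneme, position, F1 in Hz, F2 in Hz, duration in ms).
# The numbers are placeholders, not measurements from the study.
ROWS = [
    ("i", "nucleus", 300.0, 2600.0, 150.0),
    ("i", "nucleus", 320.0, 2500.0, 170.0),
    (I_SEMI, "coda", 310.0, 2550.0, 90.0),
    ("u", "nucleus", 340.0, 800.0, 160.0),
    (U_SEMI, "onset", 350.0, 820.0, 60.0),
]


def summarize(rows):
    """Return {phoneme: (mean F1, mean F2, mean duration)}."""
    grouped = defaultdict(list)
    for phoneme, _position, f1, f2, duration in rows:
        grouped[phoneme].append((f1, f2, duration))
    # zip(*values) turns the list of rows into one column per measure.
    return {
        phoneme: tuple(mean(column) for column in zip(*values))
        for phoneme, values in grouped.items()
    }


for phoneme, (f1, f2, duration) in summarize(ROWS).items():
    print(f"{phoneme}: F1={f1:.0f} Hz, F2={f2:.0f} Hz, duration={duration:.0f} ms")
```

Keeping the syllable position in each row makes it possible to compare durations across onsets, nuclei, and codas as well as formants across phonemes.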
The acoustic measures were modeled using a linear mixed-effects model (LME), with fixed effects for the target phonemes (i.e., /i̯/, /u̯/, /i/, and /u/) and random intercepts for words. Additional multiple comparisons with the Bonferroni correction were performed when a reliable effect of the target phonemes was found. All the analyses were conducted in R (R Core Team 2020) using the lme4 package (Bates et al. 2015) and the emmeans package (Lenth 2020). The data were visualized using the ggplot2 package (Wickham 2016).

Results. The distribution of the F1 and F2 is shown in Figure 1. An LME on the F1 found a reliable effect of the vowels and the semivowels. However, no significant difference on the F1 was found between /i/ and /i̯/ (p = .97) or /u/ and /u̯/ (p = .78), and no significant difference on the F2 was found between /i/ and /i̯/ (p = .91) or /u/ and /u̯/ (p = .91) (see Figure 2).

The analyses thus found no significant difference in the first or second formant between the semivowels /i̯/ and /u̯/ and the high vowels /i/ and /u/, respectively. Further, the duration of nuclei was the longest, followed by codas and onsets, which is consistent with the claim that nuclei and codas bear morae while onsets do not. Thus, the two high vowels may appear anywhere in a syllable; they are called vowels when they are in nuclei and semivowels when they are in onsets and codas. Based on this finding, it might be better to treat semivowels as vowels instead of consonants and to remove /j/ and /w/ from consonant charts for Southern Vietnamese, since they are in complementary distribution with the high vowels /i/ and /u/. This should not be done the other way around (i.e., removing high vowels from vowel charts and keeping glides in the consonant chart), since high vowels are near-universal but glides are not (e.g., Rotokas, Tigak). However, this question needs to be examined in many other languages before the view that semivowels should be treated as vowels, in order to reflect their similarities, can be generalized.
Before concluding, it is worth mentioning the limitations of this study. The sample size was small: only one native speaker of Southern Vietnamese recited the 200 words. As a result, it is difficult to generalize the results, because it is unclear whether the patterns observed in the speech data stem from individual differences. Moreover, the speaker was fluent in another language (i.e., English) besides her native language, and it is possible that her fluency in English affects how she speaks her dialect. Hence, a larger number of speakers is required to examine the idea that the glides /w/ and /j/ and the high vowels /u/ and /i/ are acoustically nearly identical.
In summary, this study has demonstrated that the glides /j/ and /w/, which are considered to be the semivowels /i̯/ and /u̯/ in Vietnamese, have acoustic properties nearly identical to those of the high vowels /i/ and /u/, with the duration of the vowels being longer than that of the semivowels. This suggests that the semivowels are non-syllabic positional variants of high vowels that appear in onsets and codas.