Regional differences (or lack thereof) in rendaku in Japanese surnames

No study has thoroughly investigated regional differences in the Japanese compound voicing known as rendaku. The present study addresses this issue by conducting a large-scale web-based survey with 492 Japanese speakers as participants and 1,176 compound surnames as stimuli. The results show no clear effects of dialects on rendaku application. This raises a novel theoretical issue for further investigation: Even though pitch accent has been argued to be inversely associated with rendaku (e

1. Introduction. Compound formation in Japanese often involves a voicing alternation known as rendaku. As shown in (1) below, when rendaku applies, the initial consonant of the second element becomes voiced, possibly with additional changes in place and manner.
(1) Rendaku application in compounds
a. [maki-zuɕi] 'rolled-sushi'
b. [oɾi-gami] 'folding-paper'
c. [ike-bana] 'enlivening-flower'

Rendaku is a variable rule and does not occur in every compound. Words composed of similar elements may or may not show voicing, as can be seen in the names of the two Japanese scripts: [hira-gana] 'plain-character' and [kata-kana] 'fragmented-character'. Researchers have proposed a number of factors, including lexical idiosyncrasy, to account for this irregular morphophonological process.
Even though rendaku is one of the most well-studied phenomena in Japanese, not much is known about its regional differences. Only a few studies have directly addressed the issue; van Bokhorst (2018) and Takemura et al. (2019) independently investigated the rendaku patterns of compound place names across Japan, both showing that, although some minor differences are found, there are no systematic dialectal effects on the way compound voicing occurs. However, several methodological and other challenges, which are listed in (2), make it difficult to take these results at face value.
(2) Problems of using place names for dialectal research
a. Non-uniform sampling
b. Effects of non-linguistic factors
c. Possible differences from local pronunciations

Place name data may be skewed due to non-uniform sampling. Different regions have distinct place names with idiosyncratic rendaku behaviors. For example, voicing is found in Yodogawa [jodo-gawa] 'stagnant-river' in Osaka but not in Arakawa [aɾa-kawa] 'rough-river' in Tokyo. It is unclear whether this really reflects dialectal differences with respect to rendaku application. The distribution of place names may also be affected by non-linguistic factors. Recently, so-called "municipal amalgamations" resulted in massive mergers of towns and cities. Many place names were thus lost without any obvious linguistic reason. Lastly, place names registered in the official postal code directory, on which the previous studies are based, can differ from local pronunciations. For instance, Kobuchisawa [ko-butɕi-sawa] 'small-abyss-stream', a town in Yamanashi, is registered with the non-rendaku reading; yet some local residents say [ko-butɕi-zawa] with voicing, which is allegedly the original pronunciation. This suggests that the postal code directory may be unsuitable for dialectal research.
In order to examine the dialectal differences of rendaku while avoiding the problems above, the present study employs compound surnames as its main data. As in regular compound words and place names, rendaku may apply in compound surnames. Some surnames show rendaku while others do not, and there are also some that variably undergo voicing, as shown in (3). Note that surnames are usually written in kanji (Chinese characters) and rendaku application is not reflected in the writing.

(3) a. 岡田  b. 坂田  c. 中田
I conduct an online survey with Japanese speakers from various regions as participants and common compound surnames as experimental stimuli. This methodology has several advantages. It guarantees quasi-uniform sampling and takes actual regional pronunciations into account, since it collects rendaku judgments on the same or a similar set of surnames from a large number of speakers. Additionally, the sound patterns of surnames raise independent theoretical issues; conducting research on surnames can further our understanding of Japanese phonology, as will be discussed in more detail in Section 3.

Experiment

(Becker & Levine 2013). At each trial, participants were presented with a compound surname written in kanji (e.g., 山崎 'mountain-cape'), which came with the honorific suffix -san in phonographic hiragana and was embedded in a frame sentence ("There is a person called 山崎-san."). They were also given two numbered options, both written in hiragana: one with the rendaku reading (e.g., 1. Yamazaki) and the other with the non-rendaku reading (e.g., 2. Yamasaki) of the surname. They were asked to read both options out loud and to choose the one that sounded more natural by clicking on a button with the corresponding number. They completed this task 120 times with surnames randomly selected from the stimulus pool. At the end of the experiment, they also voluntarily filled out a questionnaire about their age and home prefecture as well as their knowledge of linguistics.
2.1.3. PARTICIPANTS. Through a crowdsourcing service (CrowdWorks), 500 native speakers of Japanese were recruited for a reward of 200 Japanese yen. Eight participants did not provide their home prefecture, and their data were discarded. The remaining 492 speakers (mean age: 39.22; SD: 10.59) were grouped into 10 dialect-based regions, roughly following the dialect divisions proposed by previous studies (see Tojo 1954; Kato 1977).

A mixed-effects logistic regression model was constructed using the glmer function of the lme4 and lmerTest packages (Bates et al. 2015; Kuznetsova et al. 2017) in R (R Core Team 2021) with Rendaku (applied or not) as its response variable, Region (10 regions) as its predictor, and by-item (first and second elements) and by-participant random intercepts as its random structure. The results of the analysis are summarized in Table 2. The baseline intercept here corresponds to Region: Tokyo/Kanto. As can be seen, the model indicates no statistically significant effects of Region on Rendaku. Although Osaka/Kansai shows the largest effect (z = 1.854), it is still not significant at the alpha level of 0.05 (p = 0.064). In other words, rendaku application rates do not differ greatly across dialects.

2019), the data were further analyzed by second elements ("E2"). Figure 2 plots the average rendaku application rates by region for eight common E2 morphemes. The average rendaku rates appear to differ by E2; that is, some morphemes are more likely to undergo rendaku (e.g., 崎 /saki/ 'cape') than others (e.g., 谷 /tani/ 'valley'), as has been reported by previous studies on surnames (Sugito 1965; Zamma & Asai 2017). Within each E2, however, differences by region look rather small. Where differences are present, they are mostly due to skew in the datapoints; some regions have relatively few speakers and responses when the data are broken down by second-element type.
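As a quick arithmetic check on the Osaka/Kansai result reported above, the following sketch converts the z value of 1.854 into a two-sided p-value under a standard normal approximation. It is not the original analysis, which was run with glmer in R; the function name `two_sided_p_from_z` is a hypothetical helper introduced here for illustration only.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal.

    Uses the identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


ALPHA = 0.05       # significance threshold used in the text
z_osaka = 1.854    # z value reported for Osaka/Kansai

p_osaka = two_sided_p_from_z(z_osaka)
print(f"p = {p_osaka:.3f}; significant at alpha = {ALPHA}: {p_osaka < ALPHA}")
```

The resulting value agrees with the reported p = 0.064 to three decimal places, so the largest regional coefficient falls short of the 0.05 threshold.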

RENDAKU RATES BY INDIVIDUAL SURNAME.
A closer look at the results suggests that the variability of rendaku in certain surnames may actually be attributable to regional differences. For example, for the surname 山崎 /jama-saki/ 'mountain-cape', the average rendaku rate is 100% in Tokyo/Kanto, while it is 62.5% in Osaka/Kansai (see Iwasaki 2013:42 for an anecdotal remark on this particular surname). To examine the trend in such surnames with a large regional difference, I extracted those that differ in rendaku application rates by more than 25 percentage points between Tokyo/Kanto and Osaka/Kansai. Here, I focus only on the differences between the two regions with the largest populations in the country (hence, the largest numbers of participants).

Table 3. Surnames with a large regional difference in rendaku rates

The rendaku patterns of these surnames indicate that there is no consistent directionality in the regional differences. In some surnames, the rendaku rates are higher in Tokyo/Kanto than in Osaka/Kansai, but in others the pattern is reversed. It should also be noted that these surnames are not large in number. In all, 61 of the 1,176 surnames (5.19%) show differences greater than 25 percentage points. Lastly, surnames that are known for showing variable rendaku behaviors do not necessarily show a large regional difference. For example, the rendaku rates for 中田 /naka-ta/ 'central-paddy' are 26.32% in Tokyo/Kanto and 21.43% in Osaka/Kansai; for 中嶋/中島 /naka-sima/ 'central-island', the rates are 84.82% in Tokyo/Kanto and 92.10% in Osaka/Kansai.

2.3. SUMMARY. To summarize, no clear regional effects are found on the way rendaku is applied in compound surnames. This is also true when the results are analyzed by common E2 morphemes. Although some individual surnames may show differences that can be attributed to dialects, the trends are not consistent and such surnames are not large in number. Thus, the study mostly replicated the findings of previous studies of rendaku in compound place names (van Bokhorst 2018; Takemura et al. 2019) with a different set of data (surnames) and systematic experimental methods.
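The extraction criterion described above (a gap of more than 25 percentage points between Tokyo/Kanto and Osaka/Kansai) can be sketched as follows. This is a minimal illustration, not the script used in the study: the function name `large_gap_surnames` and the dictionary layout are assumptions, and the only data included are the three pairs of rates quoted in the text.

```python
from typing import Dict, List, Tuple

# Rendaku rates (%) quoted in the text, as (Tokyo/Kanto, Osaka/Kansai).
# The data structure itself is a hypothetical choice for this sketch.
RATES: Dict[str, Tuple[float, float]] = {
    "山崎": (100.0, 62.5),
    "中田": (26.32, 21.43),
    "中嶋/中島": (84.82, 92.10),
}


def large_gap_surnames(rates: Dict[str, Tuple[float, float]],
                       threshold: float = 25.0) -> List[str]:
    """Return surnames whose two regional rates differ by more than
    `threshold` percentage points, regardless of direction."""
    return [name for name, (tokyo, osaka) in rates.items()
            if abs(tokyo - osaka) > threshold]


flagged = large_gap_surnames(RATES)

# Share of surnames exceeding the threshold, from the counts in the text.
share = round(61 / 1176 * 100, 2)

print(flagged, share)
```

Taking the absolute difference reflects the observation that the gap runs in both directions: some surnames have higher rendaku rates in Tokyo/Kanto and others in Osaka/Kansai.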
3. Discussion. In this section, I discuss the inverse association between rendaku and accent in relation to the results shown in the previous section. I also propose a tentative analysis of the phenomenon.
3.1. RENDAKU-ACCENT ASSOCIATION. The experimental results above, though nonsignificant, further our understanding of Japanese rendaku. As has been stated, there are no systematic differences in rendaku application across regions or dialects as far as compound surnames are concerned. This actually raises a novel theoretical issue with respect to the relationship between voicing and accentuation.
It has been documented that rendaku application and accentedness are inversely associated. In other words, rendaku voicing and pitch accent tend not to co-occur; compounds with rendaku are likely to be unaccented, while those without are often accented. This is exemplified in (4), where surnames that variably undergo rendaku also show different accentuation patterns depending on the occurrence of compound voicing. Here, the lack of an accent mark indicates that the word is unaccented. Sugito (1965) first revealed this inverse association by examining the patterns of accentuation as well as rendaku application in compound surnames with /ta/ 'paddy' as E2 (e.g., [sáka-ta] 'slope-paddy' vs. [jama-da] 'mountain-paddy'). Later studies, such as Zamma (2005), Ohta (2013), and Zamma & Asai (2017), have shown that the generalization holds true with other kinds of surnames, even though the degree of association may be weaker depending on the E2 morpheme and there are quite a few exceptions. The phenomenon has thus been well documented, but it remains one of the unresolved issues in formal Japanese phonology.

The question now arises as to why surnames show similar rendaku patterns across dialects. Japanese dialects are known for showing various accentuation patterns (see e.g., Haraguchi 1977; Shibatani 1990; Uwano 1999; Kubozono 2012, 2015, and references therein). If rendaku voicing and pitch accent are strongly associated with each other, dialects with different accent patterns would also be expected to show great variability in rendaku application.

A POSSIBLE ACCOUNT.
As a tentative explanation, I propose that it is foot structure, not pitch accent itself, that has a strong relationship with rendaku voicing. Specifically, I argue that application of rendaku requires two consecutive feet and that this foot structure entails unaccentedness in Tokyo Japanese, but not necessarily in other dialects. Ito & Mester (2016) have proposed a foot-based account of accentuation in Tokyo Japanese, including the emergence of unaccentedness. Their analysis is illustrated in (5) below with loanwords having antepenultimate accent as well as those that are unaccented. Assumed feet are indicated by parentheses.
(5) Antepenultimacy and unaccentedness
a. (kána)da 'Canada'
b. (ame)(ɾika) 'America'

Three-mora words often receive antepenultimate accent; they have a bimoraic trochaic foot and a final extrametrical syllable, as shown in (5a), which is commonly assumed for deriving antepenultimacy in many languages. By contrast, four-mora words are often unaccented; they are exhaustively footed with two bimoraic feet, as shown in (5b). Given this foot structure, unaccentedness arises from a tension between two well-known metrical constraints: NONFINALITY(FT'), which bans a head foot from occurring at the right edge of a prosodic word, and RIGHTMOST, which requires a head foot to be the rightmost foot. Note that if a word is exhaustively footed into two consecutive feet, as in "(σσ)(σσ)," placing an accent on either foot would violate one of the above constraints. If the two constraints outrank WORDACCENT, which requires a prosodic word to have a prominence peak, the conflict is resolved by rendering the word unaccented.
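The constraint interaction just described can be made concrete with a small tableau evaluator. This is a sketch under simplifying assumptions, not Ito & Mester's formalization: footing is held fixed and only accent placement varies, the candidate sets are chosen by hand, and the names `Candidate`, `violations`, and `winner` are hypothetical. NONFINALITY(FT') and RIGHTMOST are listed in an arbitrary order, since the text only requires that both outrank WORDACCENT.

```python
from typing import Dict, List, NamedTuple, Optional


class Candidate(NamedTuple):
    label: str
    n_feet: int             # number of feet in the prosodic word
    final_unfooted: bool    # True if an extrametrical syllable follows the last foot
    head: Optional[int]     # index of the accented (head) foot; None if unaccented


def violations(c: Candidate) -> Dict[str, int]:
    """Violation marks for the three constraints discussed in the text."""
    last = c.n_feet - 1
    return {
        # A head foot may not sit at the right edge of the prosodic word.
        "NONFINALITY": int(c.head is not None and c.head == last
                           and not c.final_unfooted),
        # The head foot must be the rightmost foot: one mark per foot to its right.
        "RIGHTMOST": 0 if c.head is None else last - c.head,
        # The word must have a prominence peak.
        "WORDACCENT": int(c.head is None),
    }


# Assumed ranking: the first two constraints dominate WORDACCENT.
RANKING = ["NONFINALITY", "RIGHTMOST", "WORDACCENT"]


def winner(cands: List[Candidate], ranking: List[str] = RANKING) -> str:
    # Comparing violation profiles lexicographically models strict domination.
    return min(cands, key=lambda c: [violations(c)[k] for k in ranking]).label


three_mora = [
    Candidate("(kána)da", 1, True, 0),
    Candidate("(kana)da", 1, True, None),
]
four_mora = [
    Candidate("(áme)(rika)", 2, False, 0),
    Candidate("(ame)(ríka)", 2, False, 1),
    Candidate("(ame)(rika)", 2, False, None),
]

print(winner(three_mora))  # accented candidate
print(winner(four_mora))   # unaccented candidate
```

With a single foot followed by an extrametrical syllable, the accented candidate violates nothing and wins. With two exhaustive feet, each accented candidate violates one of the two higher-ranked constraints, so the unaccented candidate is optimal.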
While adopting this basic analysis for compound surnames, I posit that the constraint WORDACCENT exerts a stronger force on them, given the observation that proper names in general are more likely to be accented than common nouns (see Tanaka, S.'i. & Kubozono 1999; Tanaka, Y. & Sugawara 2018; Tanaka, Y. 2023). I also assume that each element of a compound surname does not need to have its own foot and that a foot can even span a morpheme boundary (see Alderete 2015), unlike in the case of a common noun compound (Kubozono 1995, 1997); this is because proper name compounds lack semantic compositionality and behave more like simplex words (see Tanaka, Y. 2017a, 2023 for discussion). This ensures that surnames, whether they are three or four moras in length, receive antepenultimate accent by default, as in [(sáka)-ta] 'slope-paddy' and [na(ká-ɕi)ma] 'central-island'.

Why, then, do surnames become unaccented when they undergo rendaku? In order to account for this incompatibility of accent and voicing, I further propose a constraint that requires rendaku, which is a feature-sized linking morpheme (Ito & Mester 2003), to be realized only between stems projecting their own feet (see Rosen 2003 for a similar idea). The constraint is grounded in the raison d'être of rendaku: Even though a compound surname may usually behave like a simplex word (see above), it needs to be like a real compound word having two full-fledged stems with their own feet to receive compound voicing. With the high ranking of this constraint, rendaku can be realized only in surnames with two consecutive feet, which in turn results in unaccentedness, as shown in (6).

(6) a. (jama)-(da) 'mountain-paddy'
b. (naka)-(ʑima) 'central-island'

Returning to the lack of dialectal differences in rendaku, I speculate that surnames have similar foot structures across dialects but do not necessarily show identical accent patterns. This is illustrated in (7) below with examples of accentuation contrasting Tokyo/Kanto Japanese and Osaka/Kansai Japanese. For clarity, the tonal melodies of words are also given, with L representing a low tone and H a high tone.

A three-mora surname without rendaku often has antepenultimate accent in Tokyo/Kanto Japanese; again, this can be analyzed by positing a left-aligned bimoraic foot and an extrametrical syllable following Ito & Mester (2016), as shown in (7a). On the right, the same surname has penultimate accent in Osaka/Kansai Japanese, but it is nonetheless analyzed as having a single foot: penultimacy can be derived from a left-aligned iambic foot or a right-aligned trochaic foot.
As shown in (7b), a surname with rendaku voicing can be unaccented in both dialects, which is analyzed as being exhaustively footed with two bimoraic feet. The melody is not the same (LHH vs. LLH), but this is most probably due to a difference in tonal assignment. What matters here is that the two dialects have the same foot structure: rendaku application requires two consecutive feet, which leads to unaccentedness.
(7c) shows that a surname with rendaku may have different accentuation in the two dialects in question: unaccentedness and penultimate accent. Note that both accent patterns can be derived from the same foot structure. Unaccentedness straightforwardly arises from two feet in a row; with high-ranked NONFINALITY(FT') and RIGHTMOST, neither foot can hold accent (Ito & Mester 2016). Exhaustive footing is also compatible with penultimacy. On the assumption that WORDACCENT is ranked higher as a matter of dialectal variation, an iambic foot on the left results in penultimate accent, as in [(imá)-(da)]. With trochaic footing, one can still derive penultimacy by assuming that a monomoraic foot and a boundary-spanning bimoraic foot, as in [(i)(má-da)], meet the requirement for rendaku application. Additionally, some Osaka/Kansai speakers are reported to pronounce the same surname as unaccented, as in [(ima)-(da)] (Sugito 1965); again, unaccentedness can easily be derived from consecutive feet.
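The idea that one foot structure is compatible with several accent patterns under different rankings can be illustrated with a small factorial typology. The sketch below is an assumption-laden illustration rather than the analysis itself: the violation profiles are entered by hand for a word with two consecutive feet, the candidate labels are hypothetical, and the mapping of "accent on left foot" to the iambic parse [(imá)-(da)] is one possible reading of the text.

```python
from itertools import permutations

# Hand-entered violation profiles for a word parsed into two consecutive feet.
# Each accentual candidate violates exactly one of the three constraints.
TABLEAU = {
    "accent on left foot":  {"NONFINALITY": 0, "RIGHTMOST": 1, "WORDACCENT": 0},
    "accent on right foot": {"NONFINALITY": 1, "RIGHTMOST": 0, "WORDACCENT": 0},
    "unaccented":           {"NONFINALITY": 0, "RIGHTMOST": 0, "WORDACCENT": 1},
}
CONSTRAINTS = ("NONFINALITY", "RIGHTMOST", "WORDACCENT")


def optimal(ranking):
    """Return the candidate whose violation profile is best under `ranking`
    (lexicographic comparison models strict domination)."""
    return min(TABLEAU, key=lambda cand: tuple(TABLEAU[cand][c] for c in ranking))


# Evaluate every possible total ranking of the three constraints.
typology = {ranking: optimal(ranking) for ranking in permutations(CONSTRAINTS)}

for ranking, output in typology.items():
    print(" >> ".join(ranking), "->", output)
```

Because each candidate violates exactly one constraint, the winner is always the candidate that violates the lowest-ranked one. Rankings with WORDACCENT at the bottom yield unaccentedness, as proposed for Tokyo/Kanto, while promoting WORDACCENT yields an accented output on the same two-foot structure.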
The analysis proposed here is, of course, not comprehensive, and its validity should be tested with more accentuation data from surnames across different dialects. I leave this for future research.

Conclusion.
To conclude, a large-scale judgment experiment has found no clear regional effects on rendaku application in compound surnames. Although some individual surnames may show different rendaku profiles across dialects, the variability does not seem to be derived in any systematic manner. This raises the question of why rendaku voicing is inversely associated with pitch accent. I have proposed a tentative account based on foot structure. Future studies should investigate not only the rendaku patterns but also the accent patterns of surnames across Japanese dialects to test the validity of the proposed account.
Table 3 shows a subset of those surnames.