A phonotactic-tonotactic grammar for Tokyo Japanese that clusters by lexical strata offers a good trade-off between model size and likelihood

Satoru Ozaki

doi:10.3765/plsa.v9i1.5725

Authors

Satoru Ozaki University of Massachusetts Amherst

DOI:

https://doi.org/10.3765/plsa.v9i1.5725

Keywords:

phonology, phonotactics, tonotactics, accent, lexical strata, model selection, MaxEnt, Japanese, Maximum Entropy

Abstract

The Japanese lexicon is typically classified into at least three etymological strata: native, Sino-Japanese and foreign words. In Tokyo Japanese, nouns from different strata are known to have different phonotactic as well as tonotactic properties. Should one analyze Tokyo Japanese nouns using a non-clustering grammar that generates all nouns using the same phonological grammar, or should one analyze them using a clustering grammar that generates nouns from different strata using different grammars? In this study, I address this question from a probabilistic and a model selection perspective: the better probabilistic grammar is one that better balances fit to data and the number of parameters in the grammar. Using the UCLA Phonotactic Learner, I train two kinds of MaxEnt grammars that correspond to non-clustering and clustering grammars. I compare the two kinds of grammar using the Bayesian Information Criterion (BIC), and show that the clustering grammars make a better trade-off between fit to data and model size than the non-clustering grammars. Consequently, different etymological strata of the Tokyo Japanese nominal lexicon are better analyzed as being generated from different MaxEnt grammars than from the same MaxEnt grammar.
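
The BIC comparison described in the abstract can be illustrated with a minimal sketch. It uses the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of observations and L the maximized likelihood; lower values indicate a better trade-off. The function names, the way the per-stratum grammars are pooled (summing log-likelihoods, parameter counts and word counts, with stratum labels treated as given), and all numbers below are assumptions made for illustration. They are not taken from the paper and may differ from its actual procedure.

```python
import math


def bic(log_likelihood: float, num_params: int, num_obs: int) -> float:
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return num_params * math.log(num_obs) - 2.0 * log_likelihood


def clustering_bic(strata):
    """BIC for a clustering grammar (hypothetical pooling scheme).

    `strata` is a list of (log_likelihood, num_params, num_obs) tuples,
    one per stratum-specific grammar. Assumption: each word's stratum is
    known, so the total log-likelihood is the sum over strata and the
    model size is the sum of the per-stratum parameter counts.
    """
    total_ll = sum(ll for ll, _, _ in strata)
    total_k = sum(k for _, k, _ in strata)
    total_n = sum(n for _, _, n in strata)
    return bic(total_ll, total_k, total_n)


# Made-up values, for illustration only (not results from the paper).
non_clustering = bic(log_likelihood=-52000.0, num_params=100, num_obs=3000)
clustering = clustering_bic([
    (-15000.0, 80, 1000),  # hypothetical native stratum
    (-14000.0, 80, 1000),  # hypothetical Sino-Japanese stratum
    (-16000.0, 80, 1000),  # hypothetical foreign stratum
])

better = "clustering" if clustering < non_clustering else "non-clustering"
print(f"non-clustering BIC: {non_clustering:.1f}")
print(f"clustering BIC:     {clustering:.1f}")
print(f"preferred under BIC: {better}")
```

In this toy setting the clustering grammar has more parameters in total, so it is penalized more heavily, yet its gain in log-likelihood outweighs the penalty. That is the kind of trade-off the BIC is meant to adjudicate.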

Published

2024-05-15

Issue

Vol. 9 No. 1 (2024): Proceedings of the Linguistic Society of America

Section

Articles

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Published by the LSA with permission of the author(s) under a CC BY 4.0 license.

How to Cite

Ozaki, Satoru. 2024. “A Phonotactic-Tonotactic Grammar for Tokyo Japanese That Clusters by Lexical Strata Offers a Good Trade-off Between Model Size and Likelihood”. Proceedings of the Linguistic Society of America 9 (1): 5725. https://doi.org/10.3765/plsa.v9i1.5725.
