Lexical Strata and Phonotactic Perplexity Minimization

Eric Robert Rosen


We present a model of gradient phonotactics that reduces overall phoneme uncertainty in a language when the phonotactic grammar is modularized, in an unsupervised fashion, into more than one sub-grammar. The model is a recurrent neural network language model (Elman 1990) which, when applied as two separate, randomly initialized modules to a corpus of Japanese words, learns lexical subdivisions that closely correlate with two of the main lexical strata proposed for Japanese by Ito and Mester (1995): Yamato and Sino-Japanese. We find that the gradient phonotactics learned by the model, which condition on the entire prior context of a phoneme, reveal a continuum of gradient stratum membership, similar to the gradient membership proposed by Hayes (2016) for the Native vs. Latinate stratification of English.
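The core idea above — split the lexicon into sub-grammars, then reassign each word to whichever sub-grammar gives it the lowest perplexity — can be sketched compactly. The paper's modules are Elman-style recurrent network language models; in this hypothetical, dependency-free sketch each module is instead a simple bigram phonotactic grammar with add-one smoothing, the corpus is invented toy data (not the paper's Japanese corpus), and a fixed lopsided initial split stands in for random initialization. It illustrates only the modularization-by-perplexity-minimization loop, not the paper's actual model.

```python
# Sketch of unsupervised lexical-stratum induction by perplexity
# minimization. Each "module" is a smoothed bigram grammar (a stand-in
# for the paper's RNN language models); words are iteratively
# reassigned to the module under which they are least surprising.
import math
from collections import defaultdict

BOUND = "#"  # word-boundary symbol


def train(words):
    """Collect bigram counts over boundary-padded words."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        seq = [BOUND] + list(w) + [BOUND]
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts


def perplexity(counts, word, V):
    """Per-transition perplexity with add-one smoothing (V = alphabet size)."""
    seq = [BOUND] + list(word) + [BOUND]
    logprob = 0.0
    for a, b in zip(seq, seq[1:]):
        row = counts[a]
        logprob += math.log((row[b] + 1) / (sum(row.values()) + V))
    return math.exp(-logprob / (len(seq) - 1))


def modularize(words, init, n_iter=5):
    """EM-style hard-assignment loop: retrain both modules on their
    current words, then move each word to its lower-perplexity module."""
    V = len({BOUND} | {p for w in words for p in w})
    assign = dict(init)
    for _ in range(n_iter):
        models = [train([w for w in words if assign[w] == k]) for k in (0, 1)]
        assign = {w: min((0, 1), key=lambda k: perplexity(models[k], w, V))
                  for w in words}
    return assign


# Two toy "strata" with disjoint phonotactics (hypothetical data).
stratum_a = ["ka", "kaka", "kakaka", "kakakaka"]
stratum_b = ["so", "soso", "sososo", "sosososo"]
words = stratum_a + stratum_b

# A deliberately mixed starting split, standing in for random init.
init = {"ka": 0, "kaka": 0, "kakaka": 0, "so": 0,
        "kakakaka": 1, "soso": 1, "sososo": 1, "sosososo": 1}
assign = modularize(words, init)
```

On this toy data the loop converges in one pass: the module holding more "ka"-type mass wins all "ka"-type words, and vice versa, so the two phonotactically coherent strata end up in different modules. Gradient membership, as in the paper, would fall out of comparing the two perplexities per word rather than taking the hard minimum.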


lexical strata; Japanese; neural network language model; gradient phonotactics



DOI: https://doi.org/10.3765/amp.v9i0.4918

Copyright (c) 2021 Eric Robert Rosen

License URL: https://creativecommons.org/licenses/by/3.0/