Comparing K-means and OPTICS clustering algorithms for identifying vowel categories

Authors

Emily Grabowski, Jennifer Kuo

DOI:

https://doi.org/10.3765/plsa.v8i1.5488

Keywords:

phonetics, vowels, unsupervised clustering, K-means, machine learning, corpus methods

Abstract

The K-means algorithm is the most commonly used clustering method for phonetic vowel description, but it has some properties that may be suboptimal for representing phonetic data. This study compares K-means with an alternative algorithm, OPTICS, on English data from two speech styles (lab and conversational speech) to test whether OPTICS is a viable alternative to K-means for characterizing vowel spaces. We find that, with noisier data, OPTICS identifies clusters that represent the underlying data more accurately. Our results highlight the importance of choosing an algorithm whose assumptions are in line with the phonetic data being considered.

Published

2023-04-27

How to Cite

Grabowski, Emily, and Jennifer Kuo. 2023. “Comparing K-Means and OPTICS Clustering Algorithms for Identifying Vowel Categories.” Proceedings of the Linguistic Society of America 8 (1): 5488. https://doi.org/10.3765/plsa.v8i1.5488.