distributional semantics, polysemy, categorization, natural language processing
Recent work has indicated that static word embeddings can predict human semantic categories (Majewska et al. 2021). In this paper, we consider the role of polysemy in semantic categorization, by comparing sense-level embeddings with previously studied static embeddings in their prediction of human-produced categories. We find that the polysemy is crucial for predicting human categories; sense-level embeddings dramatically outperform static embeddings in predicting semantic categories. Our findings highlight the role of polysemy in semantic categorization that is exclusively based on linguistic input.