On gender stereotypicality in nouns and adjectives: Comparing humans, large language models and text-to-image generators
DOI: https://doi.org/10.3765/plsa.v10i1.5954
Keywords: pronoun production, experimental linguistics, sociolinguistics, large language models, artificial intelligence, role nouns, adjectives, text-to-image generations
Abstract
Both humans and large language models (LLMs) are known to exhibit effects of gender stereotypicality. We conducted a series of studies to systematically assess to what extent humans’ and LLMs’ interpretational patterns align, how different kinds of linguistic expressions (role nouns vs. adjectives) contribute, and to what extent these patterns extend to text-to-image models. Experiments 1 and 2 test how gender-biased role nouns (e.g. plumber, nurse) and adjectives (e.g. powerful, kind) influence humans’ and GPT-4o’s assumptions about gender in a fill-in-the-blank task. Experiment 3 tests how role nouns and adjectives influence images created by the image generator DALL-E 3 (a text-to-image model). Our results show that the outputs of humans, LLMs and text-to-image models are all influenced by gender stereotypes but diverge in unexpected ways.
License
Copyright (c) 2025 Elsi Kaiser, Ashley Adji

This work is licensed under a Creative Commons Attribution 4.0 International License.
Published by the LSA with permission of the author(s) under a CC BY 4.0 license.