Pictorial constituents & the metalinguistic performance of LLMs

Authors

J. D. Storment

DOI:

https://doi.org/10.21248/jlcl.38.2025.295

Keywords:

emojis, LLM, generative AI, pictorial constituents, depiction, iconicity, grammaticality, acceptability, morphology

Abstract

In this paper I show that, although ChatGPT (GPT-4o) can provide accurate linguistic acceptability judgments for many types of sentences (Cai, Duan, Haslett, Wang, & Pickering, 2024; Collins, 2024a, 2024b; Ortega-Martín et al., 2023; Wang et al., 2023), it does not give accurate grammaticality judgments for sentences that contain pro-text emojis, that is, emojis that appear in a written utterance as morphosyntactic constituents (Cohn, Engelen, & Schilperoord, 2019; Pierini, 2021; Storment, 2024; Tieu, Qiu, Puvipalan, & Pasternak, 2025, a.o.). I demonstrate this with three distinct experiments performed on GPT-4o using both English and Spanish data. This work builds on prior research showing that the combinatorics of pro-text emojis are sensitive to the morphosyntactic constraints of the language in which the emojis appear, and it connects the poor performance of GPT-4o in this respect to two factors: (i) while LLMs are able to make some generalizations about syntactic structural dependencies, their mechanisms for making such generalizations are not derived in the same way that human syntactic structures are (Contreras Kallens, Kristensen-McLachlan, & Christiansen, 2023; Hale & Stanojević, 2024; Kennedy, 2025; Linzen & Baroni, 2021; Manova, 2024a, 2024b; Zhong, Ding, Liu, Du, & Tao, 2023, a.o.), and (ii) LLMs lack the means of directly processing iconic and pictorial content in the way that human cognition allows. I also consider the possibility that the relevant data are poorly attested in the model's training corpus. This paper establishes a precedent for research on the intersection of generative AI and utterances that contain pictorial elements as morphosyntactic constituents.

Published

2025-07-08

How to Cite

Storment, J. D. (2025). Pictorial constituents & the metalinguistic performance of LLMs. Journal for Language Technology and Computational Linguistics, 38(2), 111–124. https://doi.org/10.21248/jlcl.38.2025.295