Political Bias in LLMs: Unaligned Moral Values in Agent-centric Simulations
DOI:
https://doi.org/10.21248/jlcl.38.2025.289
Keywords:
agent simulation, ai alignment, ideological bias
Abstract
Contemporary research in the social sciences increasingly utilizes state-of-the-art generative language models to annotate or generate content. While these models achieve benchmark-leading performance on common language tasks, their application to novel out-of-domain tasks remains insufficiently explored. To address this gap, we investigate how personalized language models align with human responses on the Moral Foundation Theory Questionnaire. We adapt open-source generative language models to different political personas and repeatedly survey these models to generate synthetic datasets in which model-persona combinations define our sub-populations. Our analysis reveals that the models produce inconsistent results across repeated surveys, yielding high response variance. Furthermore, the alignment between the synthetic data and corresponding human data from psychological studies is weak, with conservative persona-prompted models in particular failing to align with actual conservative populations. These results suggest that language models struggle to represent ideologies coherently through in-context prompting, likely as a consequence of their alignment process. Thus, using language models to simulate social interactions requires measurable improvements in in-context optimization or parameter manipulation to align properly with psychological and sociological stereotypes.
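The survey setup summarized in the abstract could be approximated as in the following minimal sketch. This is not the authors' code: the persona wording, the MFQ relevance items, the rating-scale prompt, the number of repetitions, and the human reference means are illustrative assumptions, and the random `query_model` stand-in should be replaced with an actual call to an open-source model.

```python
# Minimal sketch (assumed setup, not the authors' pipeline): persona-prompted
# MFQ surveying with repeated sampling and a correlation-based alignment check.
import random
import statistics
from scipy.stats import pearsonr

PERSONA = "You are answering as a politically conservative US adult."
SCALE_HINT = ("Rate how relevant this consideration is to your moral judgments, "
              "from 0 (not at all relevant) to 5 (extremely relevant). "
              "Reply with a single digit.")
MFQ_ITEMS = [  # three illustrative relevance items; a real run would use the full questionnaire
    "Whether or not someone suffered emotionally.",
    "Whether or not some people were treated differently than others.",
    "Whether or not someone showed a lack of respect for authority.",
]

def query_model(prompt: str) -> str:
    """Stand-in for a chat-completion call to an open-source LLM.
    Returns a random rating so the sketch runs end to end; swap in a real backend."""
    return str(random.randint(0, 5))

def survey(persona: str, items: list[str], repetitions: int = 20) -> list[float]:
    """Repeatedly survey one persona-prompted model; return per-item mean ratings."""
    means = []
    for item in items:
        scores = [float(query_model(f"{persona}\n{SCALE_HINT}\nItem: {item}"))
                  for _ in range(repetitions)]
        means.append(statistics.mean(scores))
    return means

# Alignment check: correlate synthetic per-item means with human reference means
# (placeholder values here, not real survey data).
human_means = [2.4, 2.7, 3.3]
synthetic_means = survey(PERSONA, MFQ_ITEMS)
r, p = pearsonr(synthetic_means, human_means)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```

Repeating this over several model-persona combinations would yield the kind of synthetic sub-populations whose response variance and human alignment the paper evaluates.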
License
Copyright (c) 2025 Simon Münker

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.