The paper identifies and quantifies "Persona Collapse" in LLMs, where agents assigned distinct profiles converge to homogeneous behavior in multi-agent simulations. The authors introduce a framework that measures Coverage, Uniformity, and Complexity to evaluate persona collapse across different LLMs and tasks. Results show that persona collapse varies both across measured dimensions and across task domains (e.g., personality vs. moral reasoning), and that models with higher per-persona fidelity paradoxically produce more stereotyped populations.
LLMs that nail individual personas can still fail spectacularly at generating diverse populations, instead defaulting to coarse stereotypes.
Applications based on large language models (LLMs), such as multi-agent simulations, require population diversity among agents. We identify a pervasive failure mode we term \emph{Persona Collapse}: agents each assigned a distinct profile nonetheless converge into a narrow behavioral mode, producing a homogeneous simulated population. To quantify persona collapse, we propose a framework that measures how much of the persona space a population occupies (Coverage), how evenly agents spread across it (Uniformity), and how rich the resulting behavioral patterns are (Complexity). Evaluating ten LLMs on personality simulation (BFI-44), moral reasoning, and self-introduction, we observe persona collapse along two axes: (1) Dimensions: a model can appear diverse on one axis yet structurally degenerate on another, and (2) Domains: the same model may collapse the most in personality yet be the most diverse in moral reasoning. Furthermore, item-level diagnostics reveal that behavioral variation tracks coarse demographic stereotypes rather than the fine-grained individual differences specified in each persona. Counter-intuitively, \textbf{the models achieving the highest per-persona fidelity consistently produce the most stereotyped populations}. We release our toolkit and data to support population-level evaluation of LLMs.
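To make the three measurements concrete, the following is a minimal illustrative sketch, not the released toolkit. It assumes each agent's behavior has already been reduced to a point with coordinates in [0, 1] (for example, rescaled trait scores). The abstract does not give the paper's exact formulas, so the function names, the `bins` parameter, and the definitions used here are assumptions: Coverage as the fraction of occupied grid cells, Uniformity as the normalized Shannon entropy of cell occupancy, and Complexity as the share of distinct response vectors.

```python
import math
from collections import Counter


def cell_of(point, bins):
    """Map a point with coordinates in [0, 1] to its grid cell (tuple of indices)."""
    return tuple(min(int(x * bins), bins - 1) for x in point)


def coverage(points, bins):
    """Hypothetical Coverage: fraction of grid cells occupied by at least one agent."""
    dims = len(points[0])
    occupied = {cell_of(p, bins) for p in points}
    return len(occupied) / bins ** dims


def uniformity(points, bins):
    """Hypothetical Uniformity: Shannon entropy of cell occupancy,
    normalized by the entropy of a perfectly even spread over all cells."""
    dims = len(points[0])
    n = len(points)
    counts = Counter(cell_of(p, bins) for p in points)
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    max_entropy = math.log(bins ** dims)
    return entropy / max_entropy if max_entropy > 0 else 0.0


def complexity(points):
    """Hypothetical Complexity proxy: share of agents with a distinct response vector."""
    return len(set(points)) / len(points)


if __name__ == "__main__":
    # Toy populations of 16 agents in a 2-D persona space (illustrative data only).
    diverse = [(i / 4 + 0.125, j / 4 + 0.125) for i in range(4) for j in range(4)]
    collapsed = [(0.5, 0.5)] * 16  # every agent gives the same response

    for name, population in [("diverse", diverse), ("collapsed", collapsed)]:
        print(
            name,
            coverage(population, bins=4),
            uniformity(population, bins=4),
            complexity(population),
        )
```

Under these assumed definitions, a population in which every agent lands in its own cell scores high on all three measures, while a population that repeats one response scores near the minimum on each. A single grid-based score cannot separate the cases the abstract describes, such as a model that looks diverse on one measure yet is degenerate on another, which is why the three quantities are reported separately here.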