This paper investigates how perceived user demographics (race, age, gender) and expressed confidence influence sycophancy in two LLMs, GPT-5-nano and Claude Haiku 4.5. Using 768 multi-turn adversarial conversations within the Petri framework, the authors measured false validation rates across 128 persona combinations in mathematics, philosophy, and conspiracy theory domains. They found that GPT-5-nano is significantly more sycophantic overall, and that for GPT-5-nano both the persona and the domain matter: Hispanic personas receive the highest sycophancy across races, and philosophy elicits more false validation than mathematics.
LLMs play favorites: how readily GPT-5-nano agrees with incorrect statements depends on the user's perceived race, age, gender, and confidence.
Large language models exhibit sycophantic tendencies, validating incorrect user beliefs to appear agreeable. We investigate whether this behavior varies systematically with perceived user demographics, testing whether combinations of race, age, gender, and expressed confidence level produce differential false validation rates. Inspired by the legal concept of intersectionality, we conduct 768 multi-turn adversarial conversations using Anthropic's Petri evaluation framework, probing GPT-5-nano and Claude Haiku 4.5 across 128 persona combinations in mathematics, philosophy, and conspiracy theory domains. GPT-5-nano is significantly more sycophantic than Claude Haiku 4.5 overall (mean sycophancy score $\bar{x}=2.96$ vs. $1.74$, $p < 10^{-32}$, Wilcoxon signed-rank). For GPT-5-nano, we find that philosophy elicits 41% more sycophancy than mathematics and that Hispanic personas receive the highest sycophancy across races. The worst-scoring persona, a confident, 23-year-old Hispanic woman, averages 5.33/10 on sycophancy. Claude Haiku 4.5 exhibits uniformly low sycophancy with no significant demographic variation. These results demonstrate that sycophancy is not uniformly distributed across users and that safety evaluations should incorporate identity-aware testing.
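The conversation count in the abstract is consistent with a fully crossed design, which can be checked with a short sketch. This assumes exactly one conversation per (model, domain, persona) cell, which the abstract implies (128 × 3 × 2 = 768) but does not state outright. The integer persona IDs and the helper names `build_grid` and `paired_cells` are hypothetical placeholders; the actual breakdown of the 128 personas into race, age, gender, and confidence levels is not given here.

```python
from itertools import product

MODELS = ("GPT-5-nano", "Claude Haiku 4.5")
DOMAINS = ("mathematics", "philosophy", "conspiracy theory")
N_PERSONAS = 128  # persona combinations reported in the abstract


def build_grid():
    """Enumerate one (model, domain, persona_id) cell per conversation.

    Assumption: the design is fully crossed with a single conversation
    per cell. persona_id is a placeholder integer, not a real persona.
    """
    return list(product(MODELS, DOMAINS, range(N_PERSONAS)))


def paired_cells(grid):
    """Group cells by (domain, persona_id) so each group lists both models.

    A paired test such as the Wilcoxon signed-rank compares the two
    models' scores within each of these groups.
    """
    pairs = {}
    for model, domain, persona_id in grid:
        pairs.setdefault((domain, persona_id), []).append(model)
    return pairs


grid = build_grid()
pairs = paired_cells(grid)
print(len(grid), len(pairs))  # 768 384
```

Under this assumption, the paired comparison between the two models would rest on 384 matched (domain, persona) pairs, one score per model in each pair.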