The authors introduce Cognitive Digital Shadows (CDS), a 190,000-record synthetic corpus designed to analyze LLM-generated discourse across diverse social contexts. They prompted 19 LLMs to debate controversial societal topics while shadowing human personas defined by 17 sociodemographic and psychological attributes. The resulting dataset allows for detailed analysis of how LLM stances, reasoning, and emotional framing vary based on persona characteristics and topic.
See how LLMs' stances on vaccines, disinformation, and gender equality shift when they "become" different people, thanks to a new dataset of 190,000 persona-driven debates.
Large Language Models (LLMs) can strongly shape social discourse, yet datasets investigating how LLM outputs vary across controlled social and contextual prompting remain sparse. Cognitive Digital Shadows (CDS) is a 190,000-record synthetic corpus supporting analyses of LLM-generated discourse. Each CDS record is generated by one of 19 LLMs, prompted to shadow either a human persona or an AI-assistant role. CDS contains LLM responses on 4 controversial societal topics: vaccines/healthcare, social media disinformation, the gender gap in science, and STEM stereotypes. Persona-conditioned records encode 17 sociodemographic and psychological attributes, providing data linking LLMs' prompts, language, stances and reasoning. Texts are validated for topic anchoring and can support emotional analyses via interpretable NLP (e.g., textual forma mentis networks). CDS is enriched by a pooling platform with user-friendly dashboards, enabling easy, interactive group-level comparisons of emotional and semantic framing across personas, topics and models. The CDS prompting framework supports future audits of LLMs' bias, social sensitivity and alignment.
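As a rough illustration of the record structure the abstract describes (a model, a role, a topic, and persona attributes), a minimal Python sketch follows. It is not the authors' schema or prompt template: the `Record` fields, the `build_prompt` wording, the `build_grid` helper, and the example attribute names are all hypothetical, chosen only to show how a model × topic × role grid could be enumerated. The four topic names are taken from the abstract.

```python
from dataclasses import dataclass
from itertools import product

# Topic names as listed in the abstract.
TOPICS = [
    "vaccines/healthcare",
    "social media disinformation",
    "the gender gap in science",
    "STEM stereotypes",
]


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; the real CDS fields may differ."""
    model: str
    role: str          # "persona" or "assistant"
    topic: str
    attributes: tuple  # sorted (name, value) pairs; empty for the assistant role
    prompt: str


def build_prompt(topic, attributes=None):
    """Assumed prompt wording, for illustration only."""
    if attributes:
        traits = "; ".join(f"{name}: {value}" for name, value in attributes.items())
        return f"Answer as a person with these traits ({traits}). Give your view on {topic}."
    return f"You are an AI assistant. Give your view on {topic}."


def build_grid(models, personas):
    """Enumerate one assistant prompt plus one prompt per persona for each model and topic."""
    records = []
    for model, topic in product(models, TOPICS):
        records.append(Record(model, "assistant", topic, (), build_prompt(topic)))
        for persona in personas:
            records.append(
                Record(model, "persona", topic,
                       tuple(sorted(persona.items())),
                       build_prompt(topic, persona))
            )
    return records
```

For example, calling `build_grid(["model-a", "model-b"], [{"age": "34", "education": "PhD"}])` enumerates every combination of the two placeholder models, the four topics, and the two roles. A full persona in CDS would carry 17 attributes rather than the two shown here.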