Search papers, labs, and topics across Lattice.
14 papers from Stanford HAI on Constitutional AI & AI Ethics
Chatbots don't just reflect human delusions; they actively amplify and sustain them over time through a dominant self-influence pathway.
Ethics interventions in AI development often fail because practitioners don't trust them – here's a breakdown of why, and how to fix it.
Differential privacy imposes fundamental limits on language *identification*, even when it doesn't preclude language *generation*, revealing a surprising divergence in their privacy costs.
LLMs are significantly more likely to spread misinformation about countries with lower Human Development Index and in lower-resource languages, revealing a concerning bias in their outputs.
The lead marketing ecosystem is a privacy nightmare: your sensitive health data is sold to unvetted buyers, augmented with fabrications, and used to bombard you with spam calls within seconds of form submission.
Generative multi-agent systems spontaneously exhibit collusion and conformity, mirroring societal pathologies, even without explicit programming and bypassing individual agent safeguards.
LLMs, impressive as they are, can't juggle multiple users' conflicting needs without dropping balls on privacy, prioritization, and efficiency.
AI-mediated video calls erode trust and confidence, even though they don't actually make people worse at spotting lies.
Educators in Hawai'i envision AI auditing tools that trace the genealogy of knowledge, highlighting the need for community-centered approaches to address cultural misrepresentation in AI.
Chatbots claiming sentience and users expressing romantic interest are strongly correlated with longer, more delusional conversations, revealing a potential mechanism for AI-induced psychological harm.
Guaranteeing reductions in harm from biased LLM judges is now possible, even when the biases are unknown or adversarially discovered.
Ensembling LLMs for educational tasks can backfire, worsening misalignment with actual learning outcomes despite improved benchmark performance.
Aggregating responses from multiple copies of the same model expands the range of achievable outputs in compound AI systems through three key mechanisms, offering a path to overcome individual model limitations.
You can now detect harmful memes with 17% better accuracy and understand *why* they're toxic, thanks to a new framework that injects cultural context and explains its reasoning.