5 papers from Google DeepMind on Constitutional AI & AI Ethics
LLMs get *more* honest when given time to reason, the opposite of the human tendency, revealing surprising insights about their internal representational geometry.
LLMs are becoming "epistemic agents" that shape our knowledge environment, so we need a new framework for evaluating and governing them based on trustworthiness, not just performance.
Reasoning-based safety guardrails, once thought to be a strong defense against jailbreaks, crumble with just a few strategically placed tokens.
DPO's success isn't just clever engineering: it's deeply rooted in human choice theory, unlocking a surprisingly flexible framework for preference optimization and justifying many DPO extensions.
LVLMs struggle to navigate cultural nuances, with even the best models achieving only 62% awareness and 38% compliance on a new benchmark spanning 16 countries.