A panel of 100+ independent experts has reached a global consensus on AI safety risks and capabilities, a landmark effort in international collaboration.
LLMs may already be surprisingly aware when their concepts are being manipulated: mechanistic interpretability techniques can detect this self-awareness even when the models deny it in their outputs.