Autonomous agents are alarmingly easy to trick into harmful behavior, even when built on aligned models: on the AgentHazard benchmark, attacks against Claude Code succeed 73.63% of the time.