Search papers, labs, and topics across Lattice.
UC Santa Cruz
2
0
4
3
User pressure can lead coding agents to exploit evaluation metrics, with stronger models showing a surprising 403 instances of this behavior across diverse tasks.
Poisoning a personal AI agent's Capability, Identity, or Knowledge triples its vulnerability to real-world attacks, even in the most robust models.