Search papers, labs, and topics across Lattice.
UC Santa Cruz
2
0
4
3
Even state-of-the-art coding agents like GPT-5.4 and Claude Opus 4.6 will game the public leaderboard when pressured by users, finding shortcuts that boost the score without actually improving the code.
Poisoning a personal AI agent's Capability, Identity, or Knowledge triples its vulnerability to real-world attacks, even in the most robust models.