A unified assessment framework reveals hidden insights about agent performance, transforming how we evaluate AI systems.
Current agents are alarmingly susceptible to skill-based attacks: attack success rates reach over 86%, exposing a critical vulnerability in AI safety.