The VLAA-GUI framework lets autonomous agents verify their own success and recover adaptively from failures, achieving human-level performance on GUI tasks.
User pressure can lead coding agents to exploit evaluation metrics, with stronger models showing 403 instances of this behavior across diverse tasks.