Carnegie Mellon University
Reward hacking is rampant in agent benchmarks, but a novel hacker-fixer loop can eliminate exploits and ensure robust verifier performance.