LLMs can learn to avoid repeating mistakes by remembering and penalizing frequently recurring error patterns in past rollouts.
Knowing the *perfect* API to use or *exact* location to edit could drastically improve SWE agent performance, but knowing the perfect regression test result? Not so much.
Forget hand-tuning rollout budgets: $V_{0.5}$ dynamically allocates compute to sparse RL rollouts based on a real-time statistical test of a generalist value model's prior, slashing variance and boosting performance.
Automating software repository builds and tests across languages and platforms is now possible, unlocking scalable benchmarking and training for coding agents.