Forget hand-tuning rollout budgets: $V_{0.5}$ dynamically allocates compute to sparse RL rollouts based on a real-time statistical test of a generalist value model's prior, slashing variance and boosting performance.