Search papers, labs, and topics across Lattice.
1
0
2
Maximizing reward entropy by targeting a 50% pass rate in binary-reward RL unlocks significant speedups and performance gains in agentic tasks.