Forget training loops and labels: this method selects high-value reasoning examples for RL using only a single forward pass through the model.