University of Massachusetts Amherst
Forget sparse rewards: SLATE uses LLMs to judge each reasoning step, slashing gradient variance and boosting performance on retrieval-augmented reasoning tasks, especially for smaller models.