Forget expensive human annotation: this self-play method lets LLMs bootstrap their own training signals for open-ended tasks by generating rubrics to evaluate their own outputs.