Tencent
Most RLVR (reinforcement learning with verifiable rewards) datasets are just remixes of a few originals. This paper shows how to trace them back to their sources, revealing widespread data contamination.
Pass-rate-1 prompts got you down? Composition-RL boosts LLM reasoning by automatically composing multiple problems into new verifiable questions, making better use of your existing data.