Search papers, labs, and topics across Lattice.
1
0
3
RLVR training leaves a tell-tale sign: prompts encountered during fine-tuning produce unusually similar reasoning trajectories, detectable without access to model internals.