University of North Carolina at Chapel Hill
RLVR training leaves a tell-tale sign: prompts encountered during fine-tuning produce unusually similar reasoning trajectories, detectable without access to model internals.
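The black-box idea in the summary can be illustrated with a minimal sketch: sample several completions for the same prompt and measure how similar they are to one another. Everything here is an assumption made for illustration and not the method from the paper: the token-set Jaccard metric, the function names `jaccard`, `mean_pairwise_similarity`, and `looks_seen`, and the `0.8` threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two texts (assumed metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def mean_pairwise_similarity(completions: list[str]) -> float:
    """Average similarity over all unordered pairs of sampled completions."""
    pairs = list(combinations(completions, 2))
    if not pairs:
        raise ValueError("need at least two completions")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def looks_seen(completions: list[str], threshold: float = 0.8) -> bool:
    """Flag a prompt whose sampled completions are unusually alike.

    The threshold is a hypothetical value; in practice it would have to be
    calibrated against prompts known to be outside the training set.
    """
    return mean_pairwise_similarity(completions) >= threshold
```

Only sampled text is needed, so a check of this shape requires no logits or weights. A real detector would likely need a similarity measure that captures reasoning structure better than word overlap does.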