Harbin Institute of Technology
STRATAGEM shows that selectively reinforcing reasoning trajectories can substantially improve a model's ability to transfer reasoning skills across diverse tasks, especially in complex mathematical settings.