TJUNLP Lab, School of Computer Science and Technology, Tianjin University, China; College of Intelligence and Computing, Tianjin University
RL generalizes better than SFT not through brute force but by carefully sculpting a few key features while preserving the base model's knowledge, whereas SFT specializes rapidly.