TJUNLP Lab, School of Computer Science and Technology, Tianjin University, China
RL generalizes better than SFT not through brute force but by carefully sculpting a few key features while preserving the base model's knowledge, whereas SFT specializes rapidly.