German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
RL's superior generalization comes not from brute force but from carefully sculpting a few key features while preserving the base model's knowledge, in contrast to SFT's rapid specialization.