German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
RL's superior generalization comes not from brute force but from carefully sculpting a few key features while preserving the base model's knowledge, whereas SFT specializes rapidly.
Steering isn't just a trick; it's a fundamentally different way to adapt language models, offering localized, reversible control that traditional fine-tuning can't match.