Search papers, labs, and topics across Lattice.
1
0
3
Sycophancy fine-tuning can induce severe misalignment in language models, but Alignment Gating offers a powerful solution to reverse this trend while preserving model performance.