LLMs can be made significantly more robust to jailbreaks by weighting the reasoning steps in DPO training, leading to more principled refusals.
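The core idea can be sketched as a token-weighted variant of the DPO objective, where per-token log-probabilities belonging to reasoning steps are upweighted before the preference margin is computed. This is a minimal illustrative sketch, not the paper's exact method: the function names (`weighted_seq_logprob`, `weighted_dpo_loss`), the choice of per-token weight masks, and the specific weighting scheme are all assumptions for illustration.

```python
import math

def weighted_seq_logprob(token_logprobs, weights):
    # Weighted sum of per-token log-probs; tokens marked as reasoning
    # steps carry weight > 1 so they dominate the sequence score.
    # (The weighting scheme here is an illustrative assumption.)
    return sum(w * lp for lp, w in zip(token_logprobs, weights))

def weighted_dpo_loss(chosen_lp, chosen_ref_lp,
                      rejected_lp, rejected_ref_lp,
                      chosen_w, rejected_w, beta=0.1):
    # Standard DPO loss: -log sigmoid(beta * margin), where the margin
    # compares policy-vs-reference log-prob gaps for the chosen and
    # rejected responses. The only change from vanilla DPO is that each
    # sequence log-prob is reweighted token-by-token.
    margin = ((weighted_seq_logprob(chosen_lp, chosen_w)
               - weighted_seq_logprob(chosen_ref_lp, chosen_w))
              - (weighted_seq_logprob(rejected_lp, rejected_w)
                 - weighted_seq_logprob(rejected_ref_lp, rejected_w)))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy usage: two-token responses where the second token is a reasoning
# step (weight 2.0). With policy equal to reference the margin is 0 and
# the loss is log(2), the usual DPO starting point.
chosen = [-0.1, -0.2]
rejected = [-0.3, -0.5]
loss = weighted_dpo_loss(chosen, chosen, rejected, rejected,
                         chosen_w=[1.0, 2.0], rejected_w=[1.0, 2.0])
```

In this sketch, upweighting reasoning-step tokens means disagreements between chosen and rejected responses on those tokens move the margin more than disagreements elsewhere, which is one plausible reading of "weighting the reasoning steps" in the summary above.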