Forget RLHF and DPO – DRAGON lets you fine-tune generative models with rewards that compare entire *distributions* of outputs, unlocking better control and quality without human preference data.