Listwise preference optimization for diffusion models (Diffusion-LPO) outperforms pairwise DPO baselines by making use of richer, ranked human feedback.