Ant Group, Hangzhou, China
Listwise preference optimization for diffusion models (Diffusion-LPO) outperforms pairwise DPO baselines by making use of richer ranked human feedback.