Westlake University
DrPO achieves superior alignment in one-step generative models while cutting training compute cost by more than 3x, challenging standard practice in preference fine-tuning.