Westlake University
DrPO achieves superior alignment in one-step generative models while cutting training compute by more than 3x, challenging the status quo of preference finetuning.
PEFT methods aren't just about downstream accuracy; they have distinct "stability-plasticity profiles" that reveal how well they retain general capabilities, and most overshoot the optimal balance anyway.