Li Auto Inc, Kuaishou Technology
Dongmyoung Lee, Wei Chen, Xiaoshuai Chen, Rui Zong, and Petar Kormushev are with the Robot Intelligence Lab, Dyson School of Design Engineering, Imperial College London, 25 Exhibition Road, London, SW7 2DB, UK ({d.lee20, w.chen21, c.xiaoshuai19, rui.zong21, p.kormushev}@imperial.ac.uk).
Stop wasting compute on noisy preference data: filtering your RLHF datasets by "Preference Difference" boosts reward model accuracy and alignment performance.
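The filtering idea can be illustrated with a minimal sketch. The summary does not define "Preference Difference", so this example assumes each pair already carries a scalar quality score for both responses and takes the gap between them as the difference. The `PreferencePair` type, the score fields, and the `min_diff` threshold are all hypothetical names chosen for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str
    chosen_score: float    # assumed: quality score of the preferred response
    rejected_score: float  # assumed: quality score of the other response


def preference_difference(pair: PreferencePair) -> float:
    """Assumed definition: score gap between chosen and rejected responses."""
    return pair.chosen_score - pair.rejected_score


def filter_by_preference_difference(pairs, min_diff):
    """Keep only pairs whose score gap is at least min_diff."""
    return [p for p in pairs if preference_difference(p) >= min_diff]


pairs = [
    PreferencePair("p1", "good answer", "bad answer", 4.0, 1.0),   # clear gap
    PreferencePair("p2", "answer a", "answer b", 2.5, 2.0),        # near tie
    PreferencePair("p3", "weak answer", "strong answer", 1.0, 3.0),  # likely mislabeled
]

# Drop near ties and pairs whose label disagrees with the scores.
kept = filter_by_preference_difference(pairs, min_diff=1.0)
print([p.prompt for p in kept])
```

In this sketch the threshold trades dataset size against label reliability: a higher `min_diff` discards more pairs but leaves those with a clearer preference signal for reward model training.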