University of Edinburgh
RLHF's generalization gap can be decomposed into distinct error terms arising from reward shift and KL clipping, which clarifies where its limitations come from.