The Chinese University of Hong Kong
DynaCF shows that dynamically adjusting sample weights based on shortcut sensitivity can drastically improve the robustness of reward models to superficial cues.