Search papers, labs, and topics across Lattice.
University of Science and Technology of China
1
0
3
Noisy multimodal preference datasets are holding back reward model performance, but DT2IT-MRM offers a scalable curation strategy that achieves state-of-the-art results.