Stop training black-box reward models: VL-MDR offers a transparent alternative that surfaces *why* a VLM receives a given reward, opening the door to more targeted alignment.