Multilingual LLMs can be made significantly more reliable by directly optimizing for crosslingual consistency using a DPO-inspired method that requires no explicit reward model.
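To illustrate why a DPO-style objective needs no explicit reward model, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not the paper's crosslingual-consistency objective, whose details are not given here; the function name `dpo_loss`, its argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All inputs are sequence log-probabilities (hypothetical names).
    The "reward" is implicit: beta times the log-ratio between the
    policy and a frozen reference model, so no reward model is trained.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one. How the paper builds preference pairs across languages is not stated in this summary, so that step is left out.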