Current reward models struggle to distinguish good from bad agent behavior in complex tool-using scenarios, especially over long horizons, revealing a critical gap in alignment research.
Instead of forcing modalities to imitate each other, IIBalance lets each modality contribute according to its intrinsic information budget, leading to better multimodal fusion.