Explicitly grounding MLLMs with object boxes during inference can actually *hurt* performance, but this work shows how to bake visual localization into the reasoning process itself.