MLLMs can "think" with images, but their actions often fail to match their reasoning; this paper closes that gap with a training method that forces models to explain what they see.
Correcting a vision-language model's "hallucinations" can be as simple as pinpointing and editing the right intermediate representation, sidestepping costly retraining or dual-pass inference.
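The teaser doesn't spell out the mechanism, but the general idea it gestures at, editing a hidden activation at inference time instead of retraining, can be sketched in a few lines of PyTorch. Everything below is a hypothetical illustration of that generic recipe, not the paper's method: the toy model, the chosen layer, `steering_vector`, and `alpha` are all assumptions.

```python
import torch
import torch.nn as nn

# Illustrative stand-in "model" with addressable intermediate layers.
model = nn.Sequential(
    nn.Linear(16, 16),  # layer 0
    nn.ReLU(),
    nn.Linear(16, 16),  # layer 2: the intermediate representation we edit
    nn.ReLU(),
    nn.Linear(16, 4),   # output head
)

# Hypothetical correction direction; in practice such a vector would be
# derived from the model's own activations (e.g., contrasting faithful
# and hallucinated responses), not sampled at random.
steering_vector = torch.randn(16)
alpha = 0.5  # edit strength

def edit_activation(module, inputs, output):
    # Shift the layer's output along the correction direction at
    # inference time; no weights change and no second forward pass runs.
    return output + alpha * steering_vector

# Attach the edit to the chosen layer, then run one normal forward pass.
handle = model[2].register_forward_hook(edit_activation)
with torch.no_grad():
    logits = model(torch.randn(1, 16))
handle.remove()  # detach the edit when done
print(logits)
```

Returning a value from a PyTorch forward hook replaces the layer's output, which is what makes this a pure inference-time intervention: remove the hook and the original model is untouched.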