Multimodal large language models (MLLMs) can "think" with images, but their actions often don't match their reasoning. This paper addresses that mismatch with a new training method that forces them to explain what they see.
Training-free image editing can now beat fine-tuned models by injecting visual context and ensuring concept alignment.