A training-free feature adjustment pipeline unlocks the power of Visual Geometry Grounded Transformers for stereo vision, achieving state-of-the-art results on KITTI.
Stop your Med-VQA model from "hallucinating" answers: this plug-in framework forces LLMs to actually *look* at the image by bottlenecking visual information through question-conditioned cues.