Karlsruhe Institute of Technology
Panoramic vision-language models can achieve a level of holistic scene understanding and robustness in adverse conditions that traditional pinhole-based VLMs cannot match.
You can now get state-of-the-art street-view image classification from CLIP with a tiny 1.4M-parameter adapter that focuses on local image patches.