Karlsruhe Institute of Technology
Visual in-context learning models struggle with adaptation, revealing critical limitations across 106 dataset-task combinations.
Correcting errors in long-video understanding doesn't have to be a nightmare: IMPACT-CYCLE slashes human arbitration costs by 4.8x while boosting VQA accuracy by intelligently decomposing the task and focusing human effort where it matters most.
Panoramic vision-language models can achieve a level of holistic scene understanding and robustness in adverse conditions that's impossible for traditional pinhole-based VLMs.