Marc Pollefeys

Forget training wheels: this training-free method leverages uncertainty to guide vision-language models to the right image regions, boosting performance on detail-oriented tasks.

Marcel Gröpl, Jaewoo Jung +5

Computer Vision Multimodal Models Recommendation & Information Retrieval

Mar 30, 2026

Haozhe Qi +9 · Mar 30, 2026 · also SJTU

AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

MLLMs can now efficiently process 10K-frame videos without training, by adaptively selecting tokens based on the model's own uncertainty about the content.

Haozhe Qi, Kevin Qu +7

Computer Vision Multimodal Models Natural Language Processing

Mar 19, 2026

Moyang Li +3 · Mar 19, 2026

DROID-SLAM in the Wild

DROID-SLAM achieves robust real-time RGB SLAM in dynamic environments by explicitly modeling per-pixel uncertainty, outperforming existing methods that struggle with unknown dynamic objects and cluttered scenes.

Moyang Li, Zihan Zhu, Marc Pollefeys +1

Computer Vision Robotics & Embodied AI

Mar 12, 2026

Chenyangguang Zhang +5 · Mar 12, 2026

Controllable Egocentric Video Generation via Occlusion-Aware Sparse 3D Hand Joints

Generate realistic egocentric videos with consistent 3D hand articulation, even with severe occlusions, by using sparse 3D hand joints as control signals.

Chenyangguang Zhang, Botao Ye, Boqi Chen +3

Computer Vision Multimodal Models Robotics & Embodied AI

Papers (5)