NVIDIA | Feb 16, 2026 | arXiv:2602.14751

Depth Completion as Parameter-Efficient Test-Time Adaptation

Bingxin Ke, Qunjie Zhou, Jiahui Huang, Xuanchi Ren, Tianchang Shen, Konrad Schindler, Laura Leal-Taixé, Shengyu Huang

AI Summary

The paper introduces CAPA, a parameter-efficient test-time adaptation framework for depth completion that leverages pre-trained 3D foundation models (FMs) and sparse geometric cues. CAPA freezes the FM backbone and updates a minimal set of parameters using parameter-efficient fine-tuning techniques like LoRA or VPT, guided by gradients from sparse observations. By grounding the FM's geometric prior with scene-specific measurements and incorporating sequence-level parameter sharing for videos, CAPA achieves state-of-the-art depth completion results on indoor and outdoor datasets.

Key Contribution

Achieves state-of-the-art depth completion by adapting 3D foundation models at test time with minimal parameter updates, outperforming methods that train task-specific encoders, which often overfit.

Abstract

We introduce CAPA, a parameter-efficient test-time optimization framework that adapts pre-trained 3D foundation models (FMs) for depth completion, using sparse geometric cues. Unlike prior methods that train task-specific encoders for auxiliary inputs, which often overfit and generalize poorly, CAPA freezes the FM backbone. Instead, it updates only a minimal set of parameters using Parameter-Efficient Fine-Tuning (e.g. LoRA or VPT), guided by gradients calculated directly from the sparse observations available at inference time. This approach effectively grounds the foundation model's geometric prior in the scene-specific measurements, correcting distortions and misplaced structures. For videos, CAPA introduces sequence-level parameter sharing, jointly adapting all frames to exploit temporal correlations, improve robustness, and enforce multi-frame consistency. CAPA is model-agnostic, compatible with any ViT-based FM, and achieves state-of-the-art results across diverse condition patterns on both indoor and outdoor datasets. Project page: research.nvidia.com/labs/dvl/projects/capa.
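The core idea in the abstract (freeze the backbone, train only a small adapter, and take gradients from the sparse depth samples alone) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the linear "backbone" `W`, the depth head `h`, the squared-error loss, the plain gradient-descent loop, and all names and hyperparameters below are assumptions made purely for illustration, and a real system would adapt a ViT-based FM with a PEFT method such as LoRA or VPT.

```python
import numpy as np

def adapt_lora(X, W, h, obs, mask, rank=2, lr=0.01, steps=300, seed=0):
    """Fit a low-rank update B @ A on top of frozen W using only sparse depth samples.

    Hypothetical toy model: depth_i = h . ((W + B @ A) @ x_i).
    """
    rng = np.random.default_rng(seed)
    D = W.shape[0]
    A = rng.normal(size=(rank, D)) / np.sqrt(D)  # trainable, small random init
    B = np.zeros((D, rank))                      # trainable, zero init: no change at step 0
    n_obs = mask.sum()
    losses = []
    for _ in range(steps):
        pred = X @ (W + B @ A).T @ h             # dense depth prediction, shape (N,)
        resid = np.where(mask, pred - obs, 0.0)  # error only where a measurement exists
        losses.append(0.5 * np.sum(resid ** 2) / n_obs)
        # Gradient of the sparse loss w.r.t. the effective weight update.
        # W and h are only read, never written: the "backbone" stays frozen.
        G = np.outer(h, X.T @ resid) / n_obs     # shape (D, D)
        grad_B, grad_A = G @ A.T, B.T @ G
        B -= lr * grad_B
        A -= lr * grad_A
    return A, B, losses

# Synthetic scene (all values are made up for the demo).
rng = np.random.default_rng(0)
N, D = 200, 8
X = rng.normal(size=(N, D))                      # stand-in for per-pixel features
W = rng.normal(size=(D, D)) / np.sqrt(D)         # frozen "backbone" layer
h = rng.normal(size=D) / np.sqrt(D)              # frozen depth head
# The true scene deviates from the prior by a small rank-1 perturbation.
W_scene = W + 0.3 * np.outer(rng.normal(size=D), rng.normal(size=D)) / np.sqrt(D)
true_depth = X @ W_scene.T @ h

mask = np.zeros(N, dtype=bool)
mask[rng.choice(N, size=20, replace=False)] = True   # 10% of pixels have measurements
obs = np.where(mask, true_depth, 0.0)

W_before = W.copy()
A, B, losses = adapt_lora(X, W, h, obs, mask)
print(f"sparse loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because the adapter `B @ A` is shared by every pixel, gradients from the few measured pixels also change the prediction at unmeasured ones. Stacking the tokens of several frames into `X` would share one adapter across frames, which is loosely in the spirit of the sequence-level parameter sharing described for videos, though the paper's actual mechanism may differ.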

Architecture Design (Transformers, SSMs, MoE) | Computer Vision | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
