Feb 25, 2026 | arXiv:2602.21740

Structure-to-Image: Zero-Shot Depth Estimation in Colonoscopy via High-Fidelity Sim-to-Real Adaptation

Juan Yang, Yuyan Zhang, Han Jia, Bing Hu, Wanzhong Song

AI Summary

This paper introduces a Structure-to-Image (S2I) paradigm for zero-shot monocular depth estimation (MDE) in colonoscopy, addressing the sim-to-real domain gap by using depth maps as the generative foundation. The method incorporates phase congruency for domain adaptation and a cross-level structure constraint to improve both geometric structure and fine-grained details. Experiments on a phantom dataset demonstrate that MDE models fine-tuned on S2I-generated data achieve up to a 44.18% reduction in RMSE compared to existing methods.

Key Contribution

Achieves up to a 44.18% RMSE reduction in zero-shot monocular depth estimation for colonoscopy by turning depth maps from a passive constraint into an active generative foundation for sim-to-real adaptation.

Abstract

Monocular depth estimation (MDE) for colonoscopy is hampered by the domain gap between simulated and real-world images. Existing image-to-image translation methods, which use depth as a posterior constraint, often produce structural distortions and specular highlights by failing to balance realism with structure consistency. To address this, we propose a Structure-to-Image paradigm that transforms the depth map from a passive constraint into an active generative foundation. We are the first to introduce phase congruency to colonoscopic domain adaptation and design a cross-level structure constraint to co-optimize geometric structures and fine-grained details like vascular textures. In zero-shot evaluations conducted on a publicly available phantom dataset, the MDE model that was fine-tuned on our generated data achieved a maximum reduction of 44.18% in RMSE compared to competing methods. Our code is available at https://github.com/YyangJJuan/PC-S2I.git.
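
The headline figure in the abstract is a relative reduction in RMSE. As a minimal sketch of how such a number is typically computed, the snippet below evaluates RMSE over pixels with valid ground truth and then reports the percentage drop against a baseline. The function names, the "depth greater than zero means valid" masking convention, and all numeric values are illustrative assumptions; none of them are taken from the paper or its code.

```python
import math


def rmse(pred, gt):
    """Root-mean-square error over pixels whose ground-truth depth is valid.

    Assumption: a ground-truth value <= 0 marks an invalid pixel.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


# Toy flattened depth maps, for illustration only (not data from the paper).
gt = [10.0, 20.0, 0.0, 40.0]            # 0.0 marks an invalid pixel
baseline_pred = [14.0, 24.0, 5.0, 44.0]
improved_pred = [12.0, 22.0, 5.0, 42.0]

base_rmse = rmse(baseline_pred, gt)
ours_rmse = rmse(improved_pred, gt)
reduction = relative_reduction(base_rmse, ours_rmse)
print(f"baseline={base_rmse:.2f} improved={ours_rmse:.2f} reduction={reduction:.1f}%")
```

Read this way, a 44.18% reduction would mean the fine-tuned model's RMSE is roughly 56% of the competing method's RMSE on the same evaluation set.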

Computer Vision | Data Curation & Synthetic Data | Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
