The paper introduces EndoSERV, a novel vision-based localization method for robot-assisted endoluminal navigation that addresses challenges like tissue deformation and lack of landmarks. EndoSERV uses a segment-to-structure approach for long-range navigation and a real-to-virtual mapping technique to overcome label insufficiency by transferring real image features to a virtual domain with pose ground truth. Experiments on public and clinical datasets demonstrate the method's effectiveness, even without real pose labels, suggesting improved accuracy and robustness in endoluminal robot navigation.
EndoSERV enables accurate endoluminal robot navigation even without real-world pose labels, by transferring real image features to a virtual domain where pose ground truth is available for training.
Robot-assisted endoluminal procedures are increasingly used for early cancer intervention. However, the intricate, narrow and tortuous pathways within the luminal anatomy pose substantial difficulties for robot navigation. Vision-based navigation offers a promising solution, but existing localization approaches are error-prone due to tissue deformation, in vivo artifacts and a lack of distinctive landmarks for consistent localization. This paper presents a novel localization method, EndoSERV, to address these challenges. It has two main parts, SEgment-to-structure and Real-to-Virtual mapping, hence the name. Long-range and complex luminal structures are divided into smaller sub-segments, and the odometry is estimated independently for each. To cope with label insufficiency, an efficient transfer technique maps real image features to the virtual domain so that virtual pose ground truth can be used. Training comprises an offline pretraining phase that extracts texture-agnostic features and an online phase that adapts to real-world conditions. Extensive experiments on both public and clinical datasets demonstrate the effectiveness of the method even without any real pose labels.
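As a rough illustration of the sub-segment idea described in the abstract, the sketch below splits a sequence of frame-to-frame motion estimates into sub-segments, integrates the odometry within each sub-segment independently, and then chains the sub-segments into one trajectory. It is not the EndoSERV implementation. The planar (x, y, theta) pose representation, the fixed segment length, and the function names `compose`, `split_into_segments`, `integrate_segment` and `chain_segments` are all assumptions made for illustration.

```python
import math

# A pose is a hypothetical planar (x, y, theta) tuple; the real method
# would work with full camera poses, which this sketch does not model.

def compose(a, b):
    """Apply motion b, expressed in the frame of pose a, to pose a."""
    ax, ay, at = a
    bx, by, bt = b
    return (
        ax + math.cos(at) * bx - math.sin(at) * by,
        ay + math.sin(at) * bx + math.cos(at) * by,
        at + bt,
    )

def split_into_segments(motions, segment_len):
    """Cut the motion sequence into consecutive sub-segments (assumed fixed length)."""
    return [motions[i:i + segment_len] for i in range(0, len(motions), segment_len)]

def integrate_segment(motions):
    """Integrate odometry inside one sub-segment, starting from its own local origin."""
    pose = (0.0, 0.0, 0.0)
    trajectory = [pose]
    for motion in motions:
        pose = compose(pose, motion)
        trajectory.append(pose)
    return trajectory

def chain_segments(local_trajectories):
    """Stitch per-segment trajectories together using each segment's end pose as the next anchor."""
    anchor = (0.0, 0.0, 0.0)
    global_trajectory = [anchor]
    for local in local_trajectories:
        # Skip local[0]: it is the local origin, which coincides with the anchor.
        for pose in local[1:]:
            global_trajectory.append(compose(anchor, pose))
        anchor = global_trajectory[-1]
    return global_trajectory

# Toy input: five unit steps straight ahead, processed in sub-segments of two.
motions = [(1.0, 0.0, 0.0)] * 5
segments = split_into_segments(motions, 2)
trajectory = chain_segments([integrate_segment(s) for s in segments])
```

In this toy setting, chaining the sub-segments gives the same end pose as integrating all motions in a single pass. The sketch only shows the bookkeeping of estimating each sub-segment independently and then combining the results.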