This paper introduces a surgical instrument tracking method that combines real-time rendering with CMA-ES, an evolutionary optimization strategy, to estimate instrument pose and joint configurations. By using batch rendering to evaluate pose candidates in parallel, the method achieves faster inference and more robust convergence than previous approaches. Experiments on synthetic and real-world datasets demonstrate significant improvements in both accuracy and runtime, and the method generalizes to joint angle-free and bi-manual tracking scenarios.
Achieve surgical instrument tracking that's both faster and more accurate by fusing real-time rendering with evolutionary optimization, outperforming prior approaches in challenging real-world scenarios.
Accurate and efficient tracking of surgical instruments is fundamental to Robot-Assisted Minimally Invasive Surgery. Although vision-based robot pose estimation has enabled markerless calibration without tedious physical setups, reliable tool tracking for surgical robots remains challenging because of partial visibility and the specialized articulation design of surgical instruments. Prior work is often prone to unreliable feature detection under degraded visual quality and data scarcity, whereas rendering-based methods often struggle with computational cost and suboptimal convergence. In this work, we incorporate CMA-ES, an evolutionary optimization strategy, into a versatile tracking pipeline that jointly estimates surgical instrument pose and joint configurations. By using batch rendering to evaluate multiple pose candidates in parallel, the method significantly reduces inference time and improves convergence robustness. The proposed framework further generalizes to joint angle-free and bi-manual tracking settings, making it suitable for both vision feedback control and online surgery video calibration. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed method significantly outperforms prior approaches in both accuracy and runtime.
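The idea of scoring a whole population of pose candidates in one batched call can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the renderer or the loss, so `render_batch` below is a hypothetical toy that maps each candidate to three 2D keypoints of a two-link tool, and `simplified_es` is a reduced evolution strategy that keeps only the rank-mu covariance update of CMA-ES (no evolution paths, no step-size adaptation). All names, the parameter layout, and the learning rate are assumptions made for illustration.

```python
import numpy as np

def render_batch(params):
    """Hypothetical stand-in for a batch renderer.

    Maps (N, 4) candidates [x, y, shaft_angle, joint_angle] to (N, 3, 2)
    keypoints (base, joint, tip) of a planar two-link tool.
    """
    x, y, a, q = params.T
    base = np.stack([x, y], axis=1)
    joint = base + 1.0 * np.stack([np.cos(a), np.sin(a)], axis=1)
    tip = joint + 0.5 * np.stack([np.cos(a + q), np.sin(a + q)], axis=1)
    return np.stack([base, joint, tip], axis=1)

def batch_loss(params, observed):
    """Mean squared keypoint error for every candidate in one vectorized call."""
    diff = render_batch(params) - observed[None]
    return np.mean(diff ** 2, axis=(1, 2))

def simplified_es(observed, mean, sigma=0.3, popsize=32, iters=200,
                  cov_lr=0.3, seed=0):
    """Reduced CMA-ES-style search: weighted recombination plus a rank-mu
    covariance update. cov_lr is a heuristic learning rate (assumption)."""
    rng = np.random.default_rng(seed)
    dim = mean.size
    mu = popsize // 2
    # Log-decreasing recombination weights, as commonly used in CMA-ES.
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()
    cov = np.eye(dim)
    for _ in range(iters):
        # Sample the whole population from N(mean, sigma^2 * cov).
        chol = np.linalg.cholesky(cov + 1e-10 * np.eye(dim))
        steps = rng.standard_normal((popsize, dim)) @ chol.T
        candidates = mean + sigma * steps
        # One batched evaluation for all candidates.
        losses = batch_loss(candidates, observed)
        elite = steps[np.argsort(losses)[:mu]]
        # Move the mean toward the weighted average of the best steps.
        mean = mean + sigma * weights @ elite
        # Rank-mu update: blend in the weighted covariance of the best steps.
        rank_mu = (elite * weights[:, None]).T @ elite
        cov = (1.0 - cov_lr) * cov + cov_lr * rank_mu
    return mean

# Synthetic check: recover known parameters from noise-free keypoints.
true_params = np.array([0.2, -0.1, 0.5, 0.3])
observed = render_batch(true_params[None])[0]
estimate = simplified_es(observed, mean=np.array([0.0, 0.0, 0.3, 0.0]))
print("estimate:", np.round(estimate, 3))
```

In a real pipeline the batched loss would presumably compare rendered images of the instrument against the camera frame, with the batch evaluated on a GPU. The structure of the loop (sample, evaluate in one batch, rank, update) would stay the same.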