ArthroCut, a novel autonomous policy learning framework, was developed to enhance knee arthroplasty robots by enabling context-aware action generation. The framework fine-tunes a Qwen-VL backbone on a multimodal dataset of 21 knee arthroplasty cases, integrating preoperative imaging, intraoperative tracking, surgical video, and robot state. Bench-top experiments demonstrated an 86% average success rate across six standard resections, significantly outperforming baselines, with ablation studies highlighting the importance of both time-aligned surgical tokens and preoperative imaging tokens.
Achieve an 86% average bench-top success rate in autonomous robotic knee arthroplasty by fusing preoperative imaging with real-time surgical data to guide tokenized action generation.
Despite rapid commercialization of surgical robots, their autonomy and real-time decision-making remain limited in practice. To address this gap, we propose ArthroCut, an autonomous policy learning framework that upgrades knee arthroplasty robots from assistive execution to context-aware action generation. ArthroCut fine-tunes a Qwen-VL backbone on a self-built, time-synchronized multimodal dataset from 21 complete cases (23,205 RGB-D pairs), integrating preoperative CT/MR, intraoperative NDI tracking of bones and end effector, RGB-D surgical video, robot state, and textual intent. The method operates on two complementary token families: Preoperative Imaging Tokens (PIT), which encode patient-specific anatomy and planned resection planes, and Time-Aligned Surgical Tokens (TAST), which fuse real-time visual, geometric, and kinematic evidence. It emits actions in an interpretable action grammar under grammar- and safety-constrained decoding. In bench-top experiments on a knee prosthesis across seven trials, ArthroCut achieves an average success rate of 86% over the six standard resections, significantly outperforming strong baselines trained under the same protocol. Ablations show that TAST is the principal driver of reliability while PIT provides essential anatomical grounding, and their combination yields the most stable multi-plane execution. These results indicate that aligning preoperative geometry with time-aligned intraoperative perception and translating that alignment into tokenized, constrained actions is an effective path toward robust, interpretable autonomy in orthopedic robotic surgery.
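To make the idea of grammar- and safety-constrained decoding concrete, the following is a minimal sketch, not ArthroCut's implementation. The abstract does not specify the action vocabulary, the grammar, or the safety rules, so everything named here is a hypothetical assumption chosen for illustration: the token names (`SELECT_PLANE`, `ALIGN`, `CUT`, `RETRACT`, `END`), the transition table `GRAMMAR`, and the 1.0 mm deviation threshold. The sketch shows only the general technique: mask out any token that the grammar or a safety check forbids, then pick the best-scoring token among those that remain.

```python
import math

# Hypothetical action grammar: each state maps to the tokens allowed next.
# Token names are illustrative assumptions, not ArthroCut's vocabulary.
GRAMMAR = {
    "START": {"SELECT_PLANE"},
    "SELECT_PLANE": {"ALIGN"},
    "ALIGN": {"ALIGN", "CUT"},
    "CUT": {"RETRACT"},
    "RETRACT": {"SELECT_PLANE", "END"},
}


def allowed_tokens(state, deviation_mm, max_deviation_mm=1.0):
    """Return the tokens permitted by the grammar and by one safety rule."""
    allowed = set(GRAMMAR[state])
    # Assumed safety rule: never emit CUT while the tool is farther than
    # max_deviation_mm from the planned resection plane.
    if deviation_mm > max_deviation_mm:
        allowed.discard("CUT")
    return allowed


def constrained_argmax(logits, allowed):
    """Greedy decoding step: best-scoring token among the allowed ones."""
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no admissible token for this state")
    return best


# Example scores a policy might assign to the next action token (made up).
logits = {"SELECT_PLANE": 0.1, "ALIGN": 0.5, "CUT": 2.0, "RETRACT": 0.0, "END": -1.0}

# The policy prefers CUT, but the tool is 2.5 mm off the plane, so CUT is masked.
unsafe_choice = constrained_argmax(logits, allowed_tokens("ALIGN", deviation_mm=2.5))

# Once the deviation is within tolerance, CUT becomes admissible.
safe_choice = constrained_argmax(logits, allowed_tokens("ALIGN", deviation_mm=0.3))
```

In this toy setup the policy's raw preference for `CUT` is overridden whenever the assumed safety check fails, so the decoder falls back to `ALIGN`. A real system would apply the same masking to the language model's logits over its full vocabulary at every decoding step.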