This paper introduces Vision Guided Agile Interaction Control (VAIC), a framework that lets humanoid robots interact with objects in unstructured environments without relying on dense reference trajectories or perfect state observability. VAIC uses a two-stage distillation approach: a privileged teacher policy first masters the interaction skills, which are then distilled into a deployable student policy that replaces full-body tracking with velocity targets and a per-frame interaction indicator. In the reported evaluations, VAIC consistently outperforms baselines and executes diverse dynamic tasks such as box carrying and skateboarding, advancing the deployment of autonomous humanoid robots in real-world scenarios.
VAIC enables humanoid robots to perform complex object interactions in real-world settings without perfect state information, outperforming baselines on tasks such as box carrying, cart interaction, and skateboarding.
Humanoid robots hold immense potential for real-world assistance, yet agile interaction with objects in unstructured environments demands tightly coupled whole-body coordination. Despite recent advancements, current controllers face a critical deployment gap: they rely heavily on dense reference trajectories and perfect state observability, which inherently limits physical generalization. We present Vision Guided Agile Interaction Control (VAIC), a unified framework that bridges this gap by operating exclusively on onboard depth, historical proprioception, and a decoupled user command interface. VAIC employs a two-stage distillation paradigm. First, a privileged teacher policy masters diverse interaction skills using precise object kinematics and exact environmental states. Second, a deployable student policy distills these capabilities, replacing full-body tracking with velocity targets across multiple axes and an interaction indicator for each frame. The student uses a recurrent object adaptation module to implicitly infer unobservable object dynamics from raw depth streams and proprioception. Evaluations and real-world deployments on a humanoid robot demonstrate that a single VAIC policy successfully executes highly diverse dynamic tasks, including box carrying, cart interaction, and skateboarding. The policy consistently outperforms baselines, advancing autonomous humanoid deployment.
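The student's input interface and the distillation objective described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names UserCommand, student_observation, and distillation_loss are hypothetical; the three velocity axes (forward, lateral, yaw) are assumed, since the abstract says only "multiple axes"; and mean squared error is an assumed stand-in for whatever distillation loss the paper actually uses. The recurrent object adaptation module is not modeled here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class UserCommand:
    """Hypothetical per-frame command: velocity targets plus an interaction flag."""
    vx: float        # forward velocity target (assumed axis)
    vy: float        # lateral velocity target (assumed axis)
    yaw_rate: float  # turning-rate target (assumed axis)
    interact: bool   # interaction indicator for this frame

    def as_vector(self) -> List[float]:
        # Encode the boolean indicator as 0.0 / 1.0 so it can join the observation.
        return [self.vx, self.vy, self.yaw_rate, 1.0 if self.interact else 0.0]


def student_observation(depth_features: Sequence[float],
                        proprio_history: Sequence[Sequence[float]],
                        command: UserCommand) -> List[float]:
    """Concatenate onboard-only inputs; no privileged object state is included."""
    obs = list(depth_features)
    for frame in proprio_history:
        obs.extend(frame)
    obs.extend(command.as_vector())
    return obs


def distillation_loss(student_action: Sequence[float],
                      teacher_action: Sequence[float]) -> float:
    """Mean squared error between student and teacher actions (assumed loss)."""
    if len(student_action) != len(teacher_action):
        raise ValueError("action dimensions must match")
    n = len(student_action)
    return sum((s - t) ** 2 for s, t in zip(student_action, teacher_action)) / n


# Example: two depth features, two frames of 2-D proprioception, one command.
cmd = UserCommand(vx=0.5, vy=0.0, yaw_rate=0.1, interact=True)
obs = student_observation([0.2, 0.4], [[0.0, 1.0], [0.1, 0.9]], cmd)
loss = distillation_loss([0.1, 0.3], [0.0, 0.5])
```

The point of the sketch is the asymmetry between the two stages: the teacher's action is computed from privileged state that never appears in student_observation, while the student sees only depth-derived features, a proprioception history, and the decoupled command vector.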