Columbia | Mar 3, 2026 | arXiv:2603.03243

HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations

Xiaomeng Xu, Jisang Park, Han Zhang, Eric Cousineau, Aditya Bhat, Jose Barreiros, Dian Wang, Shuran Song

AI Summary

The paper introduces HoMMI, a framework for learning whole-body mobile manipulation policies from robot-free human demonstrations using egocentric sensing. To address the human-to-robot embodiment gap, the authors employ a cross-embodiment hand-eye policy design with an embodiment-agnostic visual representation, a relaxed head action representation, and a whole-body controller. The framework enables learning of long-horizon mobile manipulation tasks requiring bimanual coordination, navigation, and active perception.

Key Contribution

HoMMI trains whole-body mobile manipulation policies from robot-free human demonstrations, using a cross-embodiment hand-eye policy design to bridge the human-to-robot embodiment gap.

Abstract

We present Whole-Body Mobile Manipulation Interface (HoMMI), a data collection and policy learning framework that learns whole-body mobile manipulation directly from robot-free human demonstrations. We augment UMI interfaces with egocentric sensing to capture the global context required for mobile manipulation, enabling portable, robot-free, and scalable data collection. However, naively incorporating egocentric sensing introduces a larger human-to-robot embodiment gap in both observation and action spaces, making policy transfer difficult. We explicitly bridge this gap with a cross-embodiment hand-eye policy design, including an embodiment-agnostic visual representation; a relaxed head action representation; and a whole-body controller that realizes hand-eye trajectories through coordinated whole-body motion under robot-specific physical constraints. Together, these enable long-horizon mobile manipulation tasks requiring bimanual and whole-body coordination, navigation, and active perception. Results are best viewed at: https://hommi-robot.github.io

Data Curation & Synthetic Data | Multimodal Models | Robotics & Embodied AI | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 50

Year: 2026

Venue: N/A
