Apr 14, 2026 | arXiv:2604.12565

Scalable Trajectory Generation for Whole-Body Mobile Manipulation

Yida Niu, Xinhai Chang, Xin Liu, Ziyuan Jiao, Yixin Zhu

AI Summary

The paper introduces AutoMoMa, a GPU-accelerated framework for generating large-scale, physically valid trajectories for whole-body mobile manipulation by unifying AKR modeling with parallelized trajectory optimization. This approach achieves a speedup of over 80x compared to CPU-based baselines, enabling the creation of a dataset with over 500k trajectories spanning diverse scenes, objects, and robot embodiments. Downstream imitation learning experiments demonstrate that data scarcity, rather than algorithmic limitations, is the primary bottleneck in achieving high success rates on articulated-object manipulation tasks.
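To put the quoted speedup in perspective, here is a back-of-envelope calculation. It takes the 5,000 episodes per GPU-hour reported in the abstract, the "over 80x" speedup, and the "over 500k" dataset size at face value. Both "over" figures are treated as lower bounds, and the CPU baseline is assumed to be measured per hour on a comparable unit, which the source does not spell out. The variable and function names are illustrative only.

```python
# Rough throughput estimate from the figures quoted on this page.
# Assumption: "over 80x" and "over 500k" are treated as lower bounds.
EPISODES_PER_GPU_HOUR = 5_000   # throughput stated in the abstract
SPEEDUP_OVER_CPU = 80           # "over 80x" versus CPU-based baselines
DATASET_SIZE = 500_000          # "over 500k" trajectories


def hours_needed(n_episodes: float, episodes_per_hour: float) -> float:
    """Wall-clock hours on one device to produce n_episodes."""
    return n_episodes / episodes_per_hour


# Time to generate the dataset on a single GPU at the stated rate.
gpu_hours = hours_needed(DATASET_SIZE, EPISODES_PER_GPU_HOUR)

# Implied CPU-baseline rate if the speedup were exactly 80x.
cpu_rate = EPISODES_PER_GPU_HOUR / SPEEDUP_OVER_CPU
cpu_hours = hours_needed(DATASET_SIZE, cpu_rate)

print(f"GPU: {gpu_hours:.0f} h, CPU baseline: {cpu_hours:.0f} h "
      f"(about {cpu_hours / 24:.0f} days)")
```

Under these assumptions the dataset corresponds to roughly 100 GPU-hours, against at least 8,000 hours for a baseline that is 80x slower.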

Key Contribution

AutoMoMa generates kinematically valid mobile manipulation trajectories over 80x faster than CPU-based baselines, making large-scale data collection practical without painstaking teleoperation or slow CPU planners.

Abstract

Robots deployed in unstructured environments must coordinate whole-body motion -- simultaneously moving a mobile base and arm -- to interact with the physical world. This coupled mobility and dexterity yields a state space that grows combinatorially with scene and object diversity, demanding datasets far larger than those sufficient for fixed-base manipulation. Yet existing acquisition methods, including teleoperation and planning, are either labor-intensive or computationally prohibitive at scale. The core bottleneck is the lack of a scalable pipeline for generating large-scale, physically valid, coordinated trajectory data across diverse embodiments and environments. Here we introduce AutoMoMa, a GPU-accelerated framework that unifies AKR modeling, which consolidates base, arm, and object kinematics into a single chain, with parallelized trajectory optimization. AutoMoMa achieves 5,000 episodes per GPU-hour (over $80\times$ faster than CPU-based baselines), producing a dataset of over 500k physically valid trajectories spanning 330 scenes, diverse articulated objects, and multiple robot embodiments. Prior datasets were forced to compromise on scale, diversity, or kinematic fidelity; AutoMoMa addresses all three simultaneously. Training downstream IL policies further reveals that even a single articulated-object task requires tens of thousands of demonstrations for SOTA methods to reach $\approx 80\%$ success, confirming that data scarcity -- not algorithmic limitations -- has been the binding constraint. AutoMoMa thus bridges high-performance planning and reliable IL-based control, providing the infrastructure previously missing for coordinated mobile manipulation research. By making large-scale, kinematically valid training data practical, AutoMoMa showcases generalizable whole-body robot policies capable of operating in the diverse, unstructured settings of the real world.
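The idea of consolidating base, arm, and object kinematics into a single chain can be illustrated with a toy planar example. This is only a sketch of the general concept, not the paper's AKR formulation, which the abstract does not specify. The function names, link lengths, and joint layout (a planar base pose, a two-joint arm, and one object joint) are all hypothetical.

```python
import math
from functools import reduce


def compose(a, b):
    """Compose two planar rigid transforms given as (x, y, theta)."""
    ax, ay, at = a
    bx, by, bt = b
    return (
        ax + math.cos(at) * bx - math.sin(at) * by,
        ay + math.sin(at) * bx + math.cos(at) * by,
        at + bt,
    )


def chain_tip_pose(q, arm_links=(0.4, 0.3), object_link=0.5):
    """Forward kinematics of one serial chain: base -> arm -> object.

    q = (base_x, base_y, base_yaw, arm_q1, arm_q2, object_q).
    The mobile base is treated as virtual planar joints and the
    articulated object as one extra revolute joint plus link, so a
    single configuration vector describes the whole system.
    All dimensions are made-up illustrative values in meters.
    """
    bx, by, byaw, q1, q2, q_obj = q
    segments = [
        (bx, by, byaw),               # mobile base (virtual joints)
        (0.0, 0.0, q1),               # arm joint 1
        (arm_links[0], 0.0, 0.0),     # arm link 1
        (0.0, 0.0, q2),               # arm joint 2
        (arm_links[1], 0.0, 0.0),     # arm link 2
        (0.0, 0.0, q_obj),            # object joint (e.g. a hinge)
        (object_link, 0.0, 0.0),      # object link
    ]
    return reduce(compose, segments)


# Example: base at (1, 2) facing +y, arm and object joints at zero.
print(chain_tip_pose((1.0, 2.0, math.pi / 2, 0.0, 0.0, 0.0)))
```

Because every degree of freedom lives in one vector `q`, an optimizer can in principle treat base motion, arm motion, and object articulation as a single problem. Batching many such vectors for parallel evaluation on a GPU is one plausible reading of "parallelized trajectory optimization", though the abstract does not give those details.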

Robotics & Embodied AI | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
