The paper introduces RoboGene, an agentic framework for automatically generating diverse and physically plausible robotic manipulation tasks to address the scarcity of real-world robotic interaction data. RoboGene employs diversity-driven sampling, self-reflection mechanisms for physical constraint enforcement, and human-in-the-loop refinement. Experiments demonstrate that VLA models pre-trained with RoboGene-generated data achieve higher success rates and better generalization compared to those trained with data generated by SOTA foundation models like GPT-4o and Gemini 2.5 Pro.
Forget GPT-4o: the secret to better robot manipulation might be an agentic framework that generates diverse, physically plausible tasks, which in turn lead to superior VLA pre-training.
The pursuit of general-purpose robotic manipulation is hindered by the scarcity of diverse, real-world interaction data. Unlike data collection from the web in vision or language, robotic data collection is an active process that incurs prohibitive physical costs. Consequently, automated task curation to maximize data value remains a critical yet under-explored challenge. Existing manual methods are unscalable and biased toward common tasks, while off-the-shelf foundation models often hallucinate physically infeasible instructions. To address this, we introduce RoboGene, an agentic framework designed to automate the generation of diverse, physically plausible manipulation tasks across single-arm, dual-arm, and mobile robots. RoboGene integrates three core components: diversity-driven sampling for broad task coverage, self-reflection mechanisms to enforce physical constraints, and human-in-the-loop refinement for continuous improvement. We conduct extensive quantitative analysis and large-scale real-world experiments, collecting datasets of 18k trajectories and introducing novel metrics to assess task quality, feasibility, and diversity. Results demonstrate that RoboGene significantly outperforms state-of-the-art foundation models (e.g., GPT-4o, Gemini 2.5 Pro) on these metrics. Furthermore, real-world experiments show that VLA models pre-trained on RoboGene-generated data achieve higher success rates and superior generalization, underscoring the importance of high-quality task generation. Our project is available at https://robogene-boost-vla.github.io.
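To make the interplay of the first two components concrete, here is a minimal Python sketch of a sample-then-check loop. It is an illustration under stated assumptions, not RoboGene's implementation: the skill and object vocabularies, the `INFEASIBLE` rule table, and the function names `sample_task`, `is_feasible`, and `generate_tasks` are all hypothetical. Diversity-driven sampling is approximated by inverse-frequency weighting, self-reflection by a fixed rule lookup, and the human-in-the-loop refinement step is not modeled.

```python
import random
from collections import Counter

# Hypothetical vocabularies; the abstract does not list the real skill or object sets.
SKILLS = ["pick", "place", "pour", "open", "wipe"]
OBJECTS = ["cup", "drawer", "sponge", "bottle"]

# Hypothetical rule table standing in for the self-reflection step.
INFEASIBLE = {("pour", "drawer"), ("pour", "sponge"), ("open", "sponge"), ("open", "cup")}


def sample_task(counts, rng):
    """Draw a (skill, object) pair, favoring pairs that were drawn less often."""
    pairs = [(s, o) for s in SKILLS for o in OBJECTS]
    weights = [1.0 / (1 + counts[p]) for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def is_feasible(task):
    """Stand-in physical check: reject pairs listed in the rule table."""
    return task not in INFEASIBLE


def generate_tasks(n, seed=0, max_attempts=10_000):
    """Collect up to n feasible tasks using diversity-weighted sampling."""
    rng = random.Random(seed)
    counts = Counter()
    accepted = []
    for _ in range(max_attempts):
        if len(accepted) == n:
            break
        task = sample_task(counts, rng)
        # Count every draw so rejected pairs are also down-weighted later.
        counts[task] += 1
        if is_feasible(task):
            accepted.append(task)
    return accepted
```

Calling `generate_tasks(32)` returns a list of feasible `(skill, object)` pairs spread across the vocabulary. In a real system the rule table would presumably be replaced by a model-based critique of each candidate instruction, and rejected or low-quality tasks would feed back into the generator through human review.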