This paper introduces Compositional Simulation (ComSim), a hybrid simulation approach combining classical and neural simulation to generate realistic action-video pairs for robot training. ComSim uses a closed-loop real-sim-real data augmentation pipeline, leveraging a small real-world dataset to train a neural simulator that transforms classical simulation videos into more realistic representations. Experiments show that ComSim significantly reduces the sim2real gap, leading to improved real-world policy performance.
Generating robot training data that bridges the sim2real gap doesn't require painstakingly detailed simulation environments; instead, a neural simulator can transform classical simulations into realistic representations using only a small amount of real-world data.
Recent advances in foundation models, such as large language models and world models, have greatly enhanced the capabilities of robotics, enabling robots to perform complex tasks autonomously. However, acquiring large-scale, high-quality training data for robotics remains a challenge: it often requires substantial manual effort and covers only a limited range of real-world environments. To address this, we propose a novel hybrid approach called Compositional Simulation, which combines classical simulation and neural simulation to generate accurate action-video pairs while maintaining real-world consistency. Our approach uses a closed-loop real-sim-real data augmentation pipeline that leverages a small amount of real-world data to generate diverse, large-scale training datasets covering a broader spectrum of real-world scenarios. We train a neural simulator to transform classical simulation videos into real-world representations, which improves the accuracy of policy models in real-world environments. Through extensive experiments, we demonstrate that our method significantly reduces the sim2real domain gap, resulting in higher real-world success rates for the trained policy models. Our approach offers a scalable solution for generating robust training data and bridging the gap between simulated and real-world robotics.
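The real-sim-real loop described above can be illustrated with a toy sketch. It is not the authors' implementation: every name here (`classical_sim`, `real_camera`, `fit_neural_simulator`, `augment`) is hypothetical, "videos" are reduced to lists of scalar frames, and a least-squares affine fit stands in for the neural simulator. The sketch only shows the data flow: a few real recordings are replayed in classical simulation, a sim-to-real map is fitted on the resulting pairs, and that map is then applied to many new simulated rollouts to produce action-video pairs.

```python
import random
from statistics import mean


def classical_sim(actions):
    """Toy classical simulator: integrate 1-D actions into a 'video' of positions."""
    pos, frames = 0.0, []
    for a in actions:
        pos += a
        frames.append(pos)
    return frames


def real_camera(frames, gain=0.8, offset=0.1):
    """Assumed stand-in for real footage: same motion, shifted appearance."""
    return [gain * f + offset for f in frames]


def fit_neural_simulator(sim_videos, real_videos):
    """Fit an affine sim-to-real map by least squares.

    This is a placeholder for a learned neural video model, not the paper's model.
    """
    xs = [x for video in sim_videos for x in video]
    ys = [y for video in real_videos for y in video]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    gain = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    offset = my - gain * mx
    return lambda video: [gain * f + offset for f in video]


def augment(num_episodes, horizon, to_real, rng):
    """Generate (actions, realistic video) pairs from new simulated rollouts."""
    data = []
    for _ in range(num_episodes):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        data.append((actions, to_real(classical_sim(actions))))
    return data


rng = random.Random(0)

# Real: a small set of recorded episodes (actions plus real video).
real_actions = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(3)]
real_videos = [real_camera(classical_sim(a)) for a in real_actions]

# Real -> sim: replay the recorded actions in the classical simulator.
sim_videos = [classical_sim(a) for a in real_actions]

# Sim -> real: fit the stand-in neural simulator on the paired videos,
# then use it to turn many new simulated rollouts into training data.
to_real = fit_neural_simulator(sim_videos, real_videos)
dataset = augment(num_episodes=100, horizon=5, to_real=to_real, rng=rng)
```

In this sketch the actions come from the classical simulator and so remain exact, while only the appearance of each video is altered by the fitted map. That mirrors the division of labor the abstract attributes to the hybrid approach, under the simplifying assumptions stated above.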