This paper introduces PLUME, a world modeling framework that jointly learns to estimate uncertain physical parameters and the system dynamics conditioned on those parameters for multi-finger manipulation tasks. A learned latent space represents multiple physical parameters together with rewards, so the model can be aligned to the true dynamics through online parameter inference rather than retraining or fine-tuning. The approach is evaluated on simulated screwdriver turning, valve turning, bucket lifting, and disk flicking tasks, and on a hardware screwdriver turning task, where it transfers zero-shot from simulation and outperforms offline reinforcement learning and world-model-augmented behavior cloning baselines.
PLUME transfers zero-shot from simulation to a hardware screwdriver turning task and outperforms state-of-the-art offline reinforcement learning and world-model-augmented behavior cloning baselines by inferring uncertain physical parameters online.
Dexterous manipulation with multi-finger hands can be sensitive to physical parameters such as object shape, pose, and friction coefficients. While simulation enables large-scale data collection with known parameter values, simulation-trained policies must still handle uncertainty at deployment, where the true parameters and therefore the true dynamics are unknown. Standard domain randomization strategies may be insufficient for precise tasks like screwdriver turning, as manipulation strategies may need to change depending on specific parameter values. To address this, we propose Probabilistic Latent Unified world Modeling and parameter Estimation (PLUME), a world model that jointly learns to evolve a belief over parameter values as well as the system dynamics conditioned on those parameters. We learn a latent space to jointly represent multiple qualitatively different physical parameters along with rewards, themselves functions of partially-observable variables, to inform planning. Our novel learning framework leads to efficient alignment of the world model to true dynamics through online parameter inference as opposed to re-training or fine-tuning. We evaluate our method on simulated screwdriver turning, valve turning, bucket lifting, and disk flicking tasks, as well as a hardware screwdriver turning task, where we achieve successful zero-shot transfer of our simulation-trained policy and outperform state-of-the-art offline reinforcement learning and world-model-augmented behavior cloning baselines. Please see our website at https://plume-world-model.github.io for videos.
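To make the idea of evolving a belief over parameter values concrete, the following is a minimal sketch of online parameter inference, not PLUME's actual method. It assumes a hand-written one-dimensional dynamics function, a small discrete set of candidate friction values, and Gaussian observation noise; PLUME instead learns the dynamics and a latent parameter representation. All names here (`dynamics`, `update_belief`, `candidates`, `noise_std`) are hypothetical and chosen only for illustration.

```python
import math


def dynamics(state, action, friction):
    """Hypothetical parameter-conditioned dynamics: friction damps the action."""
    return state + (1.0 - friction) * action


def update_belief(belief, candidates, state, action, next_state, noise_std=0.05):
    """One Bayes step: posterior is proportional to prior times likelihood.

    The likelihood is an assumed Gaussian on the prediction error of the
    dynamics model under each candidate parameter value.
    """
    log_post = []
    for prior, theta in zip(belief, candidates):
        err = next_state - dynamics(state, action, theta)
        log_lik = -0.5 * (err / noise_std) ** 2
        # Floor the prior so log() stays finite if a weight underflows to 0.
        log_post.append(math.log(max(prior, 1e-300)) + log_lik)
    # Normalize in log space for numerical stability.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Candidate friction values and a uniform prior (both assumed for the demo).
candidates = [0.2, 0.5, 0.8]
belief = [1.0 / 3.0] * 3

# Simulate a "deployment" system whose true friction is unknown to the model.
true_friction = 0.5
state = 0.0
for action in [1.0, -0.5, 0.8, 0.3]:
    next_state = dynamics(state, action, true_friction)
    belief = update_belief(belief, candidates, state, action, next_state)
    state = next_state

# Most probable candidate under the final belief.
best = candidates[max(range(len(candidates)), key=belief.__getitem__)]
print(best, belief)
```

The model itself is never retrained in this loop: only the belief over parameters changes as transitions are observed, which is the property the abstract contrasts with re-training or fine-tuning. A planner could then condition on the belief, or on its most probable value, when choosing actions.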