Mila · Jun 11, 2026 · arXiv:2606.13672

$\texttt{WEAVER}$, Better, Faster, Longer: An Effective World Model for Robotic Manipulation

Arnav Kumar Jain, Yilin Wu, Jesse Farebrother, Gokul Swamy, Andrea V. Bajcsy

AI Summary

The paper introduces $\texttt{WEAVER}$, a world model architecture designed to jointly achieve fidelity, consistency, and efficiency for robotic manipulation. $\texttt{WEAVER}$ is a multi-view model trained with a flow-matching loss to predict future latents and reward values. On robotic hardware, it reaches a $\rho$=0.870 correlation with real-world success rate for policy evaluation, improves real-world success rate by 38% on top of the $\pi_{0.5}$ robot foundation model during policy improvement, and improves real-world success rate by 14% in test-time planning with a 5-10x speedup over prior world models. It also outperforms prior world models in out-of-distribution scenarios.

Key Contribution

$\texttt{WEAVER}$ raises real-world success rate by 38% when used for policy improvement on top of the $\pi_{0.5}$ robot foundation model.

Abstract

The potential impacts of world models (WMs, i.e., learned simulators) on robotics are far-reaching -- policy evaluation, policy improvement, and test-time planning -- all with limited real-world interaction. To unlock these downstream capabilities, a WM needs to jointly satisfy three desiderata: $\textit{(i)}$ fidelity (i.e., producing simulated trajectories that correlate with reality), $\textit{(ii)}$ consistency (i.e., producing simulated trajectories that are coherent over long horizons), and $\textit{(iii)}$ efficiency (i.e., producing simulated trajectories quickly). We propose $\texttt{WEAVER}$ (World Estimation Across Views for Embodied Reasoning): a WM architecture that simultaneously achieves all three desiderata, providing state-of-the-art results on robotic manipulation tasks. $\texttt{WEAVER}$ is a multi-view WM trained to predict future latents and reward values via a flow-matching loss. We distill the key design decisions across model architecture, memory, and prediction objectives required to unlock the kinds of long-horizon dynamic manipulation tasks that have confounded prior world modeling approaches. We apply $\texttt{WEAVER}$ in robotic hardware, demonstrating its effectiveness at policy evaluation ($\rho$=0.870 correlation with real-world success rate), policy improvement (real-world success rate improvement of $38\%$ on top of the $\pi_{0.5}$ robot foundation model), and test-time planning (real-world success rate improvement of $14\%$ with a $5-10\times$ speedup over prior WMs). $\texttt{WEAVER}$ also demonstrates better performance than prior WMs when evaluated on out-of-distribution scenarios. Code, models, and videos at: https://arnavkj1995.github.io/WEAVER/ .
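The abstract says $\texttt{WEAVER}$ is trained with a flow-matching loss but does not give its form. As a rough illustration only, the sketch below shows a generic flow-matching objective under a linear interpolation path between a noise sample and a target vector, which is one common choice and an assumption here, not the authors' stated formulation. The names `flow_matching_loss`, `sample_training_pair`, and `velocity_fn` are hypothetical, and conditioning on past observations and actions is omitted.

```python
import random


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity.

    Assumes a linear path x_t = (1 - t) * x0 + t * x1, whose velocity
    is the constant x1 - x0. x0 is a noise sample, x1 is the target
    (for example a future latent), and t lies in [0, 1].
    """
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    target = [b - a for a, b in zip(x0, x1)]
    pred = velocity_fn(x_t, t)
    return sum((p - v) ** 2 for p, v in zip(pred, target)) / len(target)


def sample_training_pair(x1, rng):
    """Draw Gaussian noise x0 and a uniform time t for one target x1."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]
    t = rng.random()
    return x0, t


if __name__ == "__main__":
    rng = random.Random(0)
    future_latent = [0.5, -1.0, 2.0]  # hypothetical 3-dimensional latent
    x0, t = sample_training_pair(future_latent, rng)

    # Placeholder "model" that always predicts zero velocity.
    def zero_model(x_t, t):
        return [0.0 for _ in x_t]

    print(flow_matching_loss(zero_model, x0, future_latent, t))
```

In a real training loop, `velocity_fn` would be a neural network and the loss would be averaged over a batch of sampled `(x0, x1, t)` triples. A model that outputs exactly `x1 - x0` drives this loss to zero.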

Robotics & Embodied AI · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 55

Year: 2026

Venue: N/A
