The paper introduces WorldCache, a caching framework that accelerates diffusion-based world models by addressing two obstacles: token heterogeneity and non-uniform temporal dynamics. Its first component, Curvature-guided Heterogeneous Token Prediction, estimates each token's predictability with a physics-grounded curvature score and applies a Hermite-guided damped predictor to chaotic tokens. Its second component, Chaotic-prioritized Adaptive Skipping, accumulates a curvature-normalized drift signal and recomputes only when bottleneck tokens begin to drift. Experiments show up to 3.7x end-to-end speedups while maintaining 98% rollout quality, which makes WorldCache practical in resource-constrained environments.
WorldCache achieves up to 3.7x speedups in diffusion-based world models by intelligently caching and selectively recomputing tokens, making interactive and long-horizon rollouts far more practical.
Diffusion-based world models have shown strong potential for unified world simulation, but iterative denoising remains too costly for interactive use and long-horizon rollouts. While feature caching can accelerate inference without training, we find that policies designed for single-modal diffusion transfer poorly to world models because of two world-model-specific obstacles: token heterogeneity arising from multi-modal coupling and spatial variation, and non-uniform temporal dynamics in which a small set of hard tokens drives error growth, making uniform skipping either unstable or overly conservative. We propose WorldCache, a caching framework tailored to diffusion world models. We introduce Curvature-guided Heterogeneous Token Prediction, which uses a physics-grounded curvature score to estimate token predictability and applies a Hermite-guided damped predictor to chaotic tokens with abrupt direction changes. We also design Chaotic-prioritized Adaptive Skipping, which accumulates a curvature-normalized, dimensionless drift signal and recomputes only when bottleneck tokens begin to drift. Experiments on diffusion world models show that WorldCache delivers up to 3.7× end-to-end speedups while maintaining 98% rollout quality, demonstrating the advantages and practicality of WorldCache in resource-constrained scenarios. Our code is available at https://github.com/FofGofx/WorldCache.
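To make the two components more concrete, the following is a minimal illustrative sketch, not the released implementation. The abstract does not give the formulas, so everything here is an assumption: the curvature score is taken to be a normalized second finite difference of a token's cached features, the Hermite-guided predictor is replaced by a simpler damped linear extrapolation, and the skip rule accumulates the worst chaotic token's score against a fixed budget. The names curvature_score, predict_token, SkipController, chaotic_threshold, and budget are hypothetical.

```python
import math


def _sub(a, b):
    """Element-wise difference of two feature vectors."""
    return [x - y for x, y in zip(a, b)]


def _norm(v):
    """Euclidean norm of a feature vector."""
    return math.sqrt(sum(x * x for x in v))


def curvature_score(f0, f1, f2, eps=1e-8):
    """Assumed score: how sharply one token's feature trajectory turns.

    f0, f1, f2 are the token's features at three consecutive computed steps.
    The second difference is divided by the summed first-difference norms,
    so the score is dimensionless (0 for a straight-line trajectory).
    """
    v1 = _sub(f1, f0)
    v2 = _sub(f2, f1)
    accel = _sub(v2, v1)
    return _norm(accel) / (_norm(v1) + _norm(v2) + eps)


def predict_token(f1, f2, score, chaotic_threshold=0.5):
    """Extrapolate one step ahead instead of recomputing the token.

    Stable tokens get plain linear extrapolation. Chaotic tokens get a
    damped step (a stand-in for the paper's Hermite-guided predictor).
    """
    velocity = _sub(f2, f1)
    damping = 1.0 if score < chaotic_threshold else 1.0 / (1.0 + score)
    return [x + damping * d for x, d in zip(f2, velocity)]


class SkipController:
    """Assumed skip rule: accumulate drift and recompute once it hits a budget."""

    def __init__(self, budget=1.0, chaotic_threshold=0.5):
        self.budget = budget
        self.chaotic_threshold = chaotic_threshold
        self.accumulated = 0.0

    def should_recompute(self, scores):
        # Prioritize chaotic (bottleneck) tokens: their worst score sets the drift.
        chaotic = [s for s in scores if s >= self.chaotic_threshold]
        if chaotic:
            step_drift = max(chaotic)
        else:
            step_drift = sum(scores) / len(scores) if scores else 0.0
        self.accumulated += step_drift
        if self.accumulated >= self.budget:
            self.accumulated = 0.0  # a full recompute refreshes the cache
            return True
        return False
```

In this sketch a token moving in a straight line scores zero and is extrapolated at full step size, while a token that turns sharply is damped and also pushes the controller toward an earlier recompute. The threshold and budget values are placeholders that would need tuning per model.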