Mar 6, 2026, arXiv:2603.06331

WorldCache: Accelerating World Models for Free via Heterogeneous Token Caching

Weilun Feng, Guoxin Fan, Haotong Qin, Chuanguang Yang, Mingqiang Wu, Yuqi Li, Xiangqi Li, Zhulin An, Libo Huang, Dingrui Wang, Longlong Liao, Yongjun Xu

AI Summary

The paper introduces WorldCache, a training-free caching framework that accelerates diffusion-based world models by addressing two obstacles: token heterogeneity and non-uniform temporal dynamics. Its first component, Curvature-guided Heterogeneous Token Prediction, estimates each token's predictability with a physics-grounded curvature score and applies a Hermite-guided damped predictor to chaotic tokens. Its second component, Chaotic-prioritized Adaptive Skipping, accumulates a curvature-normalized drift signal and recomputes only when bottleneck tokens begin to drift. Experiments show up to 3.7× end-to-end speedups while maintaining 98% rollout quality, which makes WorldCache practical in resource-constrained environments.

Key Contribution

WorldCache achieves up to 3.7x speedups in diffusion-based world models by intelligently caching and selectively recomputing tokens, making interactive and long-horizon rollouts far more practical.

Abstract

Diffusion-based world models have shown strong potential for unified world simulation, but the iterative denoising remains too costly for interactive use and long-horizon rollouts. While feature caching can accelerate inference without training, we find that policies designed for single-modal diffusion transfer poorly to world models due to two world-model-specific obstacles: token heterogeneity from multi-modal coupling and spatial variation, and non-uniform temporal dynamics, where a small set of hard tokens drives error growth, making uniform skipping either unstable or overly conservative. We propose WorldCache, a caching framework tailored to diffusion world models. We introduce Curvature-guided Heterogeneous Token Prediction, which uses a physics-grounded curvature score to estimate token predictability and applies a Hermite-guided damped predictor for chaotic tokens with abrupt direction changes. We also design Chaotic-prioritized Adaptive Skipping, which accumulates a curvature-normalized, dimensionless drift signal and recomputes only when bottleneck tokens begin to drift. Experiments on diffusion world models show that WorldCache delivers up to 3.7× end-to-end speedups while maintaining 98% rollout quality, demonstrating the advantages and practicality of WorldCache in resource-constrained scenarios. Our code is released at https://github.com/FofGofx/WorldCache.
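The abstract does not give formulas, so the following is only a rough sketch of how the two ideas could fit together, not the authors' implementation. Everything named here is an assumption for illustration: the curvature score is taken to be the ratio of a token's second finite difference to its first finite difference across cached steps, a simple damped linear extrapolation stands in for the Hermite-guided predictor, and the hypothetical SkipController, its budget, and its top_fraction are invented names and parameters.

```python
import math


def norm(v):
    """Euclidean norm of a feature vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def curvature_score(f0, f1, f2, eps=1e-8):
    """Assumed score: second difference over first difference (dimensionless).

    f0, f1, f2 are one token's features at three consecutive computed steps.
    A token moving in a straight line scores near 0; an abrupt turn scores high.
    """
    first = [c - b for b, c in zip(f1, f2)]
    second = [c - 2 * b + a for a, b, c in zip(f0, f1, f2)]
    return norm(second) / (norm(first) + eps)


def predict_token(f1, f2, score, chaotic_threshold=1.0, damping=0.5):
    """Extrapolate one step ahead; damp the step for high-curvature tokens.

    This is plain damped linear extrapolation, used here as a stand-in
    for the paper's Hermite-guided predictor.
    """
    scale = damping if score > chaotic_threshold else 1.0
    return [c + scale * (c - b) for b, c in zip(f1, f2)]


class SkipController:
    """Hypothetical skip policy driven by the hardest tokens only."""

    def __init__(self, budget=1.0, top_fraction=0.25):
        self.budget = budget              # assumed drift budget before recompute
        self.top_fraction = top_fraction  # assumed share of tokens treated as bottlenecks
        self.accumulated = 0.0

    def should_recompute(self, scores):
        """Accumulate mean drift of the top-scoring tokens; reset on recompute."""
        k = max(1, int(len(scores) * self.top_fraction))
        hardest = sorted(scores, reverse=True)[:k]
        self.accumulated += sum(hardest) / k
        if self.accumulated >= self.budget:
            self.accumulated = 0.0
            return True
        return False
```

In this sketch a denoising step would reuse predicted features whenever should_recompute returns False and run the full model otherwise, so the skip schedule is set by the few tokens that drift fastest rather than by a uniform interval.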

Inference & Quantization, Multimodal Models, World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
