This paper reviews video generation models through the lens of efficiency, arguing that efficiency is crucial for these models to serve as practical world simulators. It introduces a taxonomy of efficient video generation techniques, categorizing them by modeling paradigms, network architectures, and inference algorithms. The review highlights the importance of efficiency for enabling interactive applications like autonomous driving and embodied AI.
Efficiency is the key bottleneck preventing video generation models from becoming general-purpose world simulators, and this paper provides a taxonomy of techniques to overcome it.
The rapid evolution of video generation has enabled models to simulate complex physical dynamics and long-horizon causalities, positioning them as potential world simulators. However, a critical gap remains between this theoretical capacity for world simulation and the heavy computational cost of spatiotemporal modeling. To address this gap, we comprehensively and systematically review video generation frameworks and techniques that treat efficiency as a crucial requirement for practical world modeling. We introduce a novel taxonomy along three dimensions: efficient modeling paradigms, efficient network architectures, and efficient inference algorithms. We further show that bridging this efficiency gap directly empowers interactive applications such as autonomous driving, embodied AI, and game simulation. Finally, we identify emerging research frontiers in efficient video-based world modeling, arguing that efficiency is a fundamental prerequisite for evolving video generators into general-purpose, real-time, and robust world simulators.