Mar 12, 2026

arXiv:2603.11911

InSpatio-WorldFM: An Open-Source Real-Time Generative Frame Model

InSpatio Team, Xiaoyu Zhang, Weihong Pan, Zhichao Ye, Jialin Liu, Yipeng Chen, Nan Wang, Xiaojun Xiang, Weijian Xie, Yifu Wang, Haoyu Ji, Sijia Pan, Zhewen Le, Siji Pan, Jingwen Guo, Xianbin Liu, Jing Guo, Donghui Shen, Ziqiang Zhao, Haomin Liu, Guofeng Zhang

AI Summary

InSpatio-WorldFM is introduced as an open-source, real-time frame model for spatial intelligence that generates frames independently, unlike video-based models. It enforces multi-view spatial consistency using 3D anchors and spatial memory, preserving scene geometry and visual details. A three-stage training pipeline distills a pretrained image diffusion model into a controllable, real-time frame generator.

Key Contribution

Ditch the video: InSpatio-WorldFM achieves real-time spatial intelligence by generating frames independently, offering a low-latency alternative to video-based world models.

Abstract

We present InSpatio-WorldFM, an open-source real-time frame model for spatial intelligence. Unlike video-based world models that rely on sequential frame generation and incur substantial latency due to window-level processing, InSpatio-WorldFM adopts a frame-based paradigm that generates each frame independently, enabling low-latency real-time spatial inference. By enforcing multi-view spatial consistency through explicit 3D anchors and implicit spatial memory, the model preserves global scene geometry while maintaining fine-grained visual details across viewpoint changes. We further introduce a progressive three-stage training pipeline that transforms a pretrained image diffusion model into a controllable frame model and finally into a real-time generator through few-step distillation. Experimental results show that InSpatio-WorldFM achieves strong multi-view consistency while supporting interactive exploration on consumer-grade GPUs, providing an efficient alternative to traditional video-based world models for real-time world simulation.
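The latency argument in the abstract can be made concrete with a toy calculation. The sketch below is not taken from the paper: the function name, the per-frame cost, and the window size are all hypothetical, and it assumes both paradigms spend the same time per frame, which a real system need not do. It only illustrates why waiting for a whole window delays the first visible frame.

```python
def first_frame_latency_ms(per_frame_ms: float, window_size: int) -> float:
    """Time until the first frame can be shown when the model must finish
    a whole window of `window_size` frames before emitting any of them.

    A window size of 1 corresponds to frame-by-frame generation.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return per_frame_ms * window_size


# Hypothetical numbers chosen for illustration only (not measured results).
PER_FRAME_MS = 40.0   # assumed cost of generating one frame
WINDOW_SIZE = 16      # assumed window length of a video-based model

video_latency = first_frame_latency_ms(PER_FRAME_MS, WINDOW_SIZE)
frame_latency = first_frame_latency_ms(PER_FRAME_MS, 1)

print(f"window-level: {video_latency:.0f} ms, frame-level: {frame_latency:.0f} ms")
```

Under these assumptions the window-level model responds to a viewpoint change only after the full window is generated, while the frame-level model responds after a single frame.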

Computer Vision, Open-Source Models & Weights, World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 49

Year: 2026

Venue: N/A
