Xihui Liu

Ditch the VAE bottleneck: Representation Forcing lets you train unified multimodal models to generate high-quality images directly from pixels, rivaling VAE-based approaches without the architectural constraint.

Zhijie Lin, Ceyuan Yang, Fei Xiao +5

Architecture Design (Transformers, SSMs, MoE) Multimodal Models Training Efficiency & Optimization

May 6, 2026 · also Shanghai AI Lab

PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World

Interactive 3D asset generation can now be driven by functional logic and hierarchical physics, thanks to a new framework that synthesizes simulation-ready assets.

Yunhan Yang, Chunshi Wang, Junliang Ye +4

Data Curation & Synthetic Data Robotics & Embodied AI World Models & Planning

Apr 20, 2026

MultiWorld: Scalable Multi-Agent Multi-View Video World Models

Accurately simulating multi-agent interactions with consistent multi-view video is now possible thanks to MultiWorld, a framework that scales to many agents and viewpoints.

Haoyu Wu, Jiwen Yu, Yingtian Zou +1

Computer Vision Multimodal Models World Models & Planning

Apr 1, 2026 · also Shanghai AI Lab

EgoSim: Egocentric World Simulator for Embodied Interaction Generation

EgoSim delivers spatially consistent and dynamically updating egocentric simulations, outperforming existing methods and enabling cross-embodiment transfer to robotic manipulation.

Jinkun Hao, Mingda Jia, Ruiyan Wang +5

Computer Vision Robotics & Embodied AI World Models & Planning
