DAMO, PKU, Tongji, ZJU | Feb 13, 2026 | arXiv:2602.14979

RynnBrain: Open Embodied Foundation Models

Ronghao Dang, Jiayan Guo, Bohan Hou, Sicong Leng, Kehan Li, Xin Li, Jiangpin Liu, Yunxuan Mao, Zhikai Wang, Yuqian Yuan, Xiao Lin, Yang Bai, Yaxi Zhao, Min Zeng, Minghua Zeng, Ju Gao, Yuming Jiang, Jun Cen, Siteng Huang, Liuyi Wang, Wenqiao Zhang, Chengju Liu, Jianfei Yang, Shijian Lu, Deli Zhao

AI Summary

The paper introduces RynnBrain, a family of open-source spatiotemporal foundation models (2B, 8B, 30B-A3B MoE) designed to unify perception, reasoning, and planning for embodied intelligence. RynnBrain enhances egocentric understanding, spatiotemporal localization, physically grounded reasoning, and physics-aware planning within a single framework. Evaluations across 20 embodied benchmarks and 8 general vision benchmarks demonstrate that RynnBrain significantly outperforms existing embodied foundation models, particularly in physically grounded reasoning and adaptation to diverse embodied tasks.

Key Contribution

RynnBrain outperforms existing embodied foundation models, offering a unified, open-source spatiotemporal model for physically grounded reasoning and planning across a wide range of benchmarks.

Abstract

Despite rapid progress in multimodal foundation models, the embodied intelligence community still lacks a unified, physically grounded foundation model that integrates perception, reasoning, and planning within real-world spatiotemporal dynamics. We introduce RynnBrain, an open-source spatiotemporal foundation model for embodied intelligence. RynnBrain strengthens four core capabilities in a unified framework: comprehensive egocentric understanding, diverse spatiotemporal localization, physically grounded reasoning, and physics-aware planning. The RynnBrain family comprises three foundation model scales (2B, 8B, and 30B-A3B MoE) and four post-trained variants tailored for downstream embodied tasks (i.e., RynnBrain-Nav, RynnBrain-Plan, and RynnBrain-VLA) or complex spatial reasoning tasks (i.e., RynnBrain-CoP). In extensive evaluations on 20 embodied benchmarks and 8 general vision understanding benchmarks, the RynnBrain foundation models outperform existing embodied foundation models by a significant margin. The post-trained model suite further substantiates two key potentials of the RynnBrain foundation model: (i) enabling physically grounded reasoning and planning, and (ii) serving as a strong pretrained backbone that can be efficiently adapted to diverse embodied tasks.

Multimodal Models | Robotics & Embodied AI | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 116

Year: 2026

Venue: N/A
