Tsinghua AI · Mar 5, 2026 · arXiv:2603.05078

MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer

Juntong Fang, Zequn Chen, Weiqi Zhang, Donglin Di, Xuan Zhang, Xuancheng Zhang, Chen Yang, Chengmin Yang, Yu-Shen Liu

AI Summary

MoRe, a feedforward 4D reconstruction network, is introduced to efficiently recover dynamic 3D scenes from monocular videos by disentangling dynamic motion from static structure using an attention-forcing strategy. The model is fine-tuned on large-scale datasets and uses grouped causal attention to capture temporal dependencies and adapt to varying token lengths. Experiments show MoRe achieves high-quality dynamic reconstructions with exceptional efficiency compared to optimization-based methods.

Key Contribution

MoRe replaces per-scene optimization with a feedforward transformer that reconstructs dynamic 4D scenes from monocular video by disentangling dynamic motion from static structure.

Abstract

Reconstructing dynamic 4D scenes remains challenging due to the presence of moving objects that corrupt camera pose estimation. Existing optimization methods alleviate this issue with additional supervision, but they are mostly computationally expensive and impractical in real-time applications. To address these limitations, we propose MoRe, a feedforward 4D reconstruction network that efficiently recovers dynamic 3D scenes from monocular videos. Built upon a strong static reconstruction backbone, MoRe employs an attention-forcing strategy to disentangle dynamic motion from static structure. To further enhance robustness, we fine-tune the model on large-scale, diverse datasets encompassing both dynamic and static scenes. Moreover, our grouped causal attention captures temporal dependencies and adapts to varying token lengths across frames, ensuring temporally coherent geometry reconstruction. Extensive experiments on multiple benchmarks demonstrate that MoRe achieves high-quality dynamic reconstructions with exceptional efficiency.
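The abstract does not spell out how grouped causal attention is implemented. One plausible reading is a frame-wise block-causal mask: tokens attend freely within their own frame and to all tokens of earlier frames, and each frame may contribute a different number of tokens. The NumPy sketch below illustrates that reading only; the function names `grouped_causal_mask` and `masked_attention`, the grouping by frame, and the single-head formulation are assumptions for illustration, not the authors' code.

```python
import numpy as np

def grouped_causal_mask(tokens_per_frame):
    """Return a boolean mask M where M[i, j] is True if query token i
    may attend to key token j. Hypothetical helper, not from the paper."""
    # Frame index of every token, e.g. [3, 2] -> [0, 0, 0, 1, 1].
    frame_ids = np.repeat(np.arange(len(tokens_per_frame)), tokens_per_frame)
    # Token i sees token j when j's frame is not later than i's frame,
    # so attention is full inside a frame and causal across frames.
    return frame_ids[:, None] >= frame_ids[None, :]

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    # Subtract the row maximum for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Usage: two frames with different token counts (3 and 2 tokens).
rng = np.random.default_rng(0)
tokens_per_frame = [3, 2]
n, d = sum(tokens_per_frame), 4
q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
mask = grouped_causal_mask(tokens_per_frame)
out = masked_attention(q, k, v, mask)
```

Because the mask is built from per-frame token counts rather than a fixed block size, the same function handles frames of different lengths, which is the property the abstract attributes to the grouped design.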

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 61

Year: 2026

Venue: N/A
