PixARMesh autoregressively reconstructs complete 3D indoor scene meshes from a single RGB image by jointly predicting object layout and geometry within a unified mesh generative model. The model augments a point-cloud encoder with pixel-aligned image features and global scene context via cross-attention, enabling accurate spatial reasoning. By generating scenes autoregressively from a unified token stream of context, pose, and mesh tokens, PixARMesh achieves state-of-the-art reconstruction quality while producing lightweight, compact meshes.
Forget messy SDFs and post-hoc hacks: PixARMesh directly generates artist-ready 3D scene meshes from a single image in one fell swoop.
We introduce PixARMesh, a method that autoregressively reconstructs complete 3D indoor scene meshes directly from a single RGB image. Unlike prior methods that rely on implicit signed distance fields and post-hoc layout optimization, PixARMesh jointly predicts object layout and geometry within a unified model, producing coherent, artist-ready meshes in a single generation pass with no post-hoc optimization. Building on recent advances in mesh generative models, we augment a point-cloud encoder with pixel-aligned image features and global scene context via cross-attention, enabling accurate spatial reasoning from a single image. Scenes are generated autoregressively from a unified token stream of context, pose, and mesh tokens, yielding compact meshes with high-fidelity geometry. Experiments on synthetic and real-world datasets show that PixARMesh achieves state-of-the-art reconstruction quality while producing lightweight, high-quality meshes ready for downstream applications.
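To make the conditioning mechanism concrete, the following is a minimal PyTorch sketch of point-cloud tokens cross-attending to pixel-aligned image features plus a global scene-context token. All dimensions, module names, and the residual layout are our own assumptions for illustration, not the released PixARMesh architecture.

```python
# Sketch: image-conditioned point encoding via cross-attention.
# Shapes, names, and the residual/norm layout are assumptions.
import torch
import torch.nn as nn

class ImageConditionedPointEncoder(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, point_tokens, pixel_feats, global_ctx):
        # point_tokens: (B, Np, d)  tokens from a point-cloud encoder
        # pixel_feats:  (B, HW, d)  pixel-aligned image features
        # global_ctx:   (B, 1, d)   a single global scene-context token
        cond = torch.cat([pixel_feats, global_ctx], dim=1)  # (B, HW+1, d)
        attn_out, _ = self.cross_attn(query=point_tokens, key=cond, value=cond)
        return self.norm(point_tokens + attn_out)  # residual update

enc = ImageConditionedPointEncoder()
out = enc(torch.randn(2, 1024, 512),   # point tokens
          torch.randn(2, 196, 512),    # 14x14 pixel-aligned features
          torch.randn(2, 1, 512))      # global context
print(out.shape)  # torch.Size([2, 1024, 512])
```

Injecting both pixel-aligned and global features as the cross-attention keys/values lets each point token localize itself against the image while still seeing scene-level context, which is the spatial-reasoning role the abstract attributes to this module.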
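The unified token stream can likewise be pictured as one flat sequence per scene. The sketch below shows one plausible [context | pose | mesh] ordering with hypothetical special tokens and toy vocabulary ids; the paper's actual tokenization is not specified here.

```python
# Sketch: flattening a scene into one autoregressive token stream.
# Special tokens (BOS/EOS/SEP) and all ids are illustrative.
from typing import Dict, List

BOS, EOS, SEP = 0, 1, 2  # hypothetical special tokens

def build_stream(context_tokens: List[int],
                 objects: List[Dict[str, List[int]]]) -> List[int]:
    """Flatten a scene into [BOS, context, (SEP, pose, mesh)*, EOS]."""
    stream = [BOS] + context_tokens
    for obj in objects:
        stream += [SEP] + obj["pose"]  # quantized layout (translation/rotation/scale)
        stream += obj["mesh"]          # tokenized mesh geometry
    return stream + [EOS]

# Example: a two-object scene with toy token ids.
scene = build_stream(
    context_tokens=[10, 11, 12],
    objects=[{"pose": [20, 21, 22], "mesh": [30, 31, 32, 33]},
             {"pose": [23, 24, 25], "mesh": [34, 35]}],
)
print(scene)
# [0, 10, 11, 12, 2, 20, 21, 22, 30, 31, 32, 33, 2, 23, 24, 25, 34, 35, 1]
```

Because pose tokens precede each object's mesh tokens in the sequence, a decoder-only model conditions every generated face on the already-committed layout, which is one way the joint layout-and-geometry prediction described above can be realized.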