Adobe Research | UPenn | Feb 23, 2026 | arXiv:2602.20160

tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction

Chen Wang, Hao Tan, Wang Yifan, Zhiqin Chen, Yuheng Liu, Kalyan Sunkavalli, Sai Bi, Lingjie Liu, Yiwei Hu

AI Summary

The paper introduces tttLRM, a 3D reconstruction model using a Test-Time Training (TTT) layer to enable efficient long-context, autoregressive 3D reconstruction. By compressing multiple image observations into the TTT layer's fast weights, the model forms an implicit 3D representation that can be decoded into explicit formats like Gaussian Splats. Experiments demonstrate that pretraining on novel view synthesis improves reconstruction quality and convergence speed, achieving state-of-the-art performance in feedforward 3D Gaussian reconstruction for both objects and scenes.

Key Contribution

Achieve state-of-the-art 3D reconstruction with linear complexity by compressing multi-view images into a test-time trained layer.

Abstract

We propose tttLRM, a novel large 3D reconstruction model that leverages a Test-Time Training (TTT) layer to enable long-context, autoregressive 3D reconstruction with linear computational complexity, further scaling the model's capability. Our framework efficiently compresses multiple image observations into the fast weights of the TTT layer, forming an implicit 3D representation in the latent space that can be decoded into various explicit formats, such as Gaussian Splats (GS) for downstream applications. The online learning variant of our model supports progressive 3D reconstruction and refinement from streaming observations. We demonstrate that pretraining on novel view synthesis tasks effectively transfers to explicit 3D modeling, resulting in improved reconstruction quality and faster convergence. Extensive experiments show that our method achieves superior performance in feedforward 3D Gaussian reconstruction compared to state-of-the-art approaches on both objects and scenes.
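To make the idea of compressing observations into fast weights concrete, here is a minimal sketch of a generic linear test-time-training update. It is an illustration under assumptions, not the tttLRM architecture: the fast weights are a single matrix `W`, each observation is a hypothetical (key, value) token pair, and the self-supervised loss is the squared error `0.5 * ||W k - v||^2`. The helper names `matvec`, `ttt_update`, and `compress` are invented for this example.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def ttt_update(W, k, v, lr):
    """One gradient step on 0.5 * ||W k - v||^2 with respect to W.

    The gradient is the outer product (W k - v) k^T.
    """
    err = [p - t for p, t in zip(matvec(W, k), v)]
    return [[w - lr * e * kj for w, kj in zip(row, k)]
            for row, e in zip(W, err)]


def compress(tokens, dim, lr):
    """Fold a stream of (key, value) tokens into the fast weights.

    Each token is visited once, so the cost grows linearly with the
    number of tokens, and the state W has a fixed size.
    """
    W = [[0.0] * dim for _ in range(dim)]  # assumed zero initialization
    for k, v in tokens:
        W = ttt_update(W, k, v, lr)
    return W


# Toy stream: two tokens with orthonormal keys (assumed data).
tokens = [([1.0, 0.0], [0.0, 2.0]),
          ([0.0, 1.0], [3.0, 0.0])]
W = compress(tokens, dim=2, lr=1.0)

# Querying with a stored key reads back the associated value.
readout = matvec(W, [1.0, 0.0])
```

Because `compress` only ever calls `ttt_update` on the next token, the same loop can be run incrementally as new observations arrive, which is the intuition behind the streaming variant described in the abstract. In this toy setting the keys are orthonormal and the step size is 1, so each value is recovered exactly; with real image tokens the fast weights would hold only a lossy compression.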

Architecture Design (Transformers, SSMs, MoE) | Computer Vision | Training Efficiency & Optimization

Citation Metrics

Citations: 0
Influential citations: 0
References: 73
Year: 2026
Venue: N/A
