May 28, 2026 | arXiv:2605.29891

DVSM: Decoder-only View Synthesis Model Done Right

Cheng Sun, Jaesung Choe, Min-Hung Chen, Ryo Hachiuma, Yu-Chiang Frank Wang

AI Summary

This paper challenges the prevailing encoder-decoder architecture in Large View Synthesis Models (LVSMs) by demonstrating that a decoder-only architecture, representing scenes implicitly as a KV-cache, achieves superior performance with fewer parameters. Weight sharing between the color-input reconstruction network and the camera-only rendering network is shown to improve feature alignment and synthesis quality. The proposed Decoder-only View Synthesis Model (DVSM), incorporating foundation model priors and stage-wise patch sizing, establishes a new state-of-the-art in novel view synthesis across multiple benchmarks.

Key Contribution

A decoder-only architecture for large view synthesis uses fewer parameters than encoder-decoder variants at identical rendering complexity and sets a new state of the art, in some cases even outperforming per-scene-optimized 3DGS under dense input views.

Abstract

Recent Large View Synthesis Models (LVSMs) advocate an encoder-decoder architecture that separates reconstruction and rendering into distinct networks. We re-examine this design. Through controlled experiments, we show that a decoder-only architecture, which represents scenes implicitly as a KV-cache, outperforms encoder-decoder variants while using fewer parameters at identical rendering complexity. Further analysis shows that sharing weights between the color-input reconstruction network and the camera-only rendering network better aligns their features at the same viewpoint, facilitating image synthesis. Building on this finding, our model, dubbed DVSM, further incorporates foundation model priors and stage-wise patch sizing for an improved efficiency-quality tradeoff. Our results establish a new state of the art for novel-view synthesis across multiple benchmarks, in some cases even outperforming per-scene-optimized 3DGS under dense input views.
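
The idea of representing a scene implicitly as a KV-cache can be illustrated with a toy example. The following is a minimal NumPy sketch, not the authors' implementation: the names `TinyDecoderLayer`, `reconstruct`, and `render` are hypothetical, the layers are single-head attention with no MLP or normalization, and random vectors stand in for real image and camera tokens. It also assumes that target tokens attend to the cached input keys/values plus their own, which the abstract does not specify. The point it shows is that one set of weights serves both passes: input-view tokens fill the cache, and camera-only target tokens read from it.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class TinyDecoderLayer:
    """Hypothetical single-head attention layer with a residual connection."""

    def __init__(self, dim, rng):
        scale = 1.0 / np.sqrt(dim)
        self.dim = dim
        self.wq = rng.normal(0.0, scale, (dim, dim))
        self.wk = rng.normal(0.0, scale, (dim, dim))
        self.wv = rng.normal(0.0, scale, (dim, dim))

    def kv(self, x):
        # Project tokens to keys and values.
        return x @ self.wk, x @ self.wv

    def attend(self, x, k, v):
        # Queries come from x; keys/values are supplied by the caller.
        scores = (x @ self.wq) @ k.T / np.sqrt(self.dim)
        return x + softmax(scores) @ v

def reconstruct(layers, input_tokens):
    """Run input-view tokens through the layers and keep each layer's K/V.

    The returned list of (K, V) pairs is the implicit scene representation.
    """
    cache, x = [], input_tokens
    for layer in layers:
        k, v = layer.kv(x)
        cache.append((k, v))
        x = layer.attend(x, k, v)
    return cache

def render(layers, cache, target_tokens):
    """Run camera-only target tokens through the SAME layers.

    Assumption: each target token attends to the cached input K/V
    concatenated with the target tokens' own K/V.
    """
    x = target_tokens
    for layer, (k_in, v_in) in zip(layers, cache):
        k_t, v_t = layer.kv(x)
        k = np.concatenate([k_in, k_t], axis=0)
        v = np.concatenate([v_in, v_t], axis=0)
        x = layer.attend(x, k, v)
    return x

rng = np.random.default_rng(0)
dim = 8
layers = [TinyDecoderLayer(dim, rng) for _ in range(2)]
input_tokens = rng.normal(size=(6, dim))   # stand-in for color + camera tokens
target_tokens = rng.normal(size=(3, dim))  # stand-in for camera-only tokens

cache = reconstruct(layers, input_tokens)      # built once per scene
out = render(layers, cache, target_tokens)     # reused for every target view
print(out.shape)
```

In this sketch the cache is built once per scene and reused for each target view, so the per-view cost covers only the target tokens. An encoder-decoder variant would instead use separate weights for `reconstruct` and `render`.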

Architecture Design (Transformers, SSMs, MoE) | Computer Vision | Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 99

Year: 2026

Venue: N/A
