This paper introduces FrameCrafter, a novel approach to sparse novel view synthesis (NVS) that leverages video diffusion models by reformulating NVS as a low frame-rate video completion task. To handle the unordered nature of sparse NVS inputs, the authors propose architectural modifications to video models, including per-frame latent encodings and removal of temporal positional embeddings, effectively making the models permutation-invariant. Experiments demonstrate that video models can be adapted to NVS with minimal supervision, achieving competitive performance on sparse-view NVS benchmarks.
Video diffusion models already contain implicit multi-view knowledge, making them surprisingly effective for novel view synthesis when adapted to ignore temporal coherence.
We tackle the problem of sparse novel view synthesis (NVS) using video diffusion models; given $K$ ($\approx 5$) multi-view images of a scene and their camera poses, we predict the view from a target camera pose. Many prior approaches leverage generative image priors encoded via diffusion models. However, models trained on single images lack multi-view knowledge. We instead argue that video models already contain implicit multi-view knowledge and so should be easier to adapt for NVS. Our key insight is to formulate sparse NVS as a low frame-rate video completion task. However, one challenge is that sparse NVS is defined over an unordered set of inputs, often too sparse to admit a meaningful order, so the models should be $\textit{invariant}$ to permutations of that input set. To this end, we present FrameCrafter, which adapts video models (naturally trained with coherent frame orderings) to permutation-invariant NVS through several architectural modifications, including per-frame latent encodings and removal of temporal positional embeddings. Our results suggest that video models can be easily trained to "forget" about time with minimal supervision, producing competitive performance on sparse-view NVS benchmarks. Project page: https://frame-crafter.github.io/
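The link between removing temporal positional embeddings and permutation invariance can be illustrated with a toy example. The sketch below is not FrameCrafter's architecture: it assumes each input view is reduced to a single token vector, omits camera-pose conditioning and diffusion entirely, and uses hypothetical names (`attend`, `synthesize`, `temporal_pos`). It only shows that a target token attending over a set of frame tokens gives the same output under any input order, and that adding index-based temporal embeddings breaks this property.

```python
import math
import random


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def synthesize(target, frames, temporal_pos=None):
    """Toy target-view readout: the target token attends over the frame tokens.

    If temporal_pos is given, its i-th vector is added to the i-th frame,
    which ties each frame to its position in the sequence (an assumption
    standing in for a video model's temporal positional embedding).
    """
    if temporal_pos is not None:
        frames = [[f + p for f, p in zip(frame, pos)]
                  for frame, pos in zip(frames, temporal_pos)]
    return attend(target, frames, frames)


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


rng = random.Random(0)
DIM, K = 8, 5  # K matches the "about 5 input views" setting in the abstract
frames = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]
target = [rng.gauss(0, 1) for _ in range(DIM)]
temporal_pos = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(K)]

# A fixed, non-identity permutation of the input views (cyclic shift).
shuffled = frames[1:] + frames[:1]

gap_without = max_abs_diff(synthesize(target, frames),
                           synthesize(target, shuffled))
gap_with = max_abs_diff(synthesize(target, frames, temporal_pos),
                        synthesize(target, shuffled, temporal_pos))

print("no temporal embeddings:  ", gap_without)
print("with temporal embeddings:", gap_with)
```

Without temporal embeddings the two outputs agree up to floating-point rounding, since attention is a weighted sum over a set. With index-based embeddings the same frame receives a different offset depending on where it sits in the sequence, so the output changes with the ordering.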