Search papers, labs, and topics across Lattice.
InfinityStory is a framework for generating long-form storytelling videos with consistent visual narratives, addressing limitations in background consistency, multi-subject shot transitions, and scalability. It introduces a background-consistent generation pipeline that preserves character identity and spatial relationships across scenes. A transition-aware video synthesis module generates smooth shot transitions, supported by a new synthetic dataset of 10,000 multi-subject transition sequences.
Finally, AI can generate hour-long videos with consistent characters and backgrounds, thanks to a new framework that nails seamless transitions between shots.
Generating long-form storytelling videos with consistent visual narratives remains a significant challenge in video synthesis. We present a novel framework, dataset, and a model that address three critical limitations: background consistency across shots, seamless multi-subject shot-to-shot transitions, and scalability to hour-long narratives. Our approach introduces a background-consistent generation pipeline that maintains visual coherence across scenes while preserving character identity and spatial relationships. We further propose a transition-aware video synthesis module that generates smooth shot transitions for complex scenarios involving multiple subjects entering or exiting frames, going beyond the single-subject limitations of prior work. To support this, we contribute with a synthetic dataset of 10,000 multi-subject transition sequences covering underrepresented dynamic scene compositions. On VBench, InfinityStory achieves the highest Background Consistency (88.94), highest Subject Consistency (82.11), and the best overall average rank (2.80), showing improved stability, smoother transitions, and better temporal coherence.