This paper introduces Ultra Flash, a cascaded streaming framework that enables real-time high-resolution video generation at 1K and 2K resolutions, achieving approximately 30 FPS and 18 FPS, respectively, on a single GPU. The framework employs a novel T2V-to-TV2V super-resolution training paradigm and a causal streaming latent upsampler to enhance spatiotemporal coherence while maintaining computational efficiency. Extensive experiments validate Ultra Flash's capability to produce ultra-high-resolution streaming video with state-of-the-art visual quality, addressing a critical gap in the current video generation landscape.
Ultra Flash achieves real-time high-resolution video generation at unprecedented frame rates, pushing the boundaries of what's possible in streaming video AI.
While recent autoregressive video diffusion models achieve remarkable streaming quality, they remain confined to low resolutions (e.g., 480P), leaving efficient, scalable, real-time high-resolution video generation a fundamental open challenge. To bridge this gap, we present Ultra Flash, a cascaded streaming framework capable of real-time high-resolution video generation. Ultra Flash achieves ~30 FPS at 1K resolution and ~18 FPS at 2K resolution on a single GPU through three key contributions: (1) an architecture-preserving T2V-to-TV2V super-resolution training paradigm coupled with an AIGC-oriented data degradation pipeline that effectively preserves the generative capability of the base model, enabling enhanced high-resolution detail when cascaded after mainstream low-resolution generative models; (2) a causal streaming latent upsampler paired with a high-resolution decoder, which enhances spatiotemporal coherence while enabling efficient latent spatial scaling and precise high-resolution decoding with negligible computational overhead; and (3) a cascaded optimization scheme for high-resolution streaming video generation that first performs hybrid-reward-enhanced sparse causalization and single-step distillation of the super-resolution model, then introduces cascaded streaming self-forcing preference optimization with dynamic cache management, jointly enhancing overall coherence, improving quality, and enabling real-time high-resolution streaming video generation. Extensive experiments demonstrate that Ultra Flash reliably produces ultra-high-resolution streaming video while maintaining state-of-the-art visual quality and superior efficiency.
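As a rough illustration of the cascaded, causal streaming data flow described in the abstract, the following minimal Python sketch processes low-resolution chunks one at a time, passes each through an upsampler that sees only past context, and decodes the result. It is not the authors' implementation: the names `stream_cascade`, `frame_budget_ms`, `upsample`, `decode`, and `cache_size` are hypothetical, plain lists stand in for latent tensors, and a fixed-size deque is only a simple stand-in for the dynamic cache management that the abstract mentions without specifying. The helper `frame_budget_ms` just converts a target frame rate into the per-frame time budget it implies.

```python
from collections import deque
from typing import Callable, Deque, Iterator, List

# Stand-in for a latent chunk; a real system would use tensors.
Chunk = List[float]


def frame_budget_ms(fps: float) -> float:
    """Per-frame time budget (in milliseconds) implied by a target frame rate."""
    return 1000.0 / fps


def stream_cascade(
    low_res_chunks: Iterator[Chunk],
    upsample: Callable[[Chunk, List[Chunk]], Chunk],
    decode: Callable[[Chunk], Chunk],
    cache_size: int = 2,
) -> Iterator[Chunk]:
    """Yield decoded high-resolution chunks one at a time.

    Assumption: the upsampler is causal, so it receives only the
    current chunk plus previously upsampled chunks, never future ones.
    """
    # Bounded cache: once full, appending evicts the oldest entry.
    cache: Deque[Chunk] = deque(maxlen=cache_size)
    for chunk in low_res_chunks:
        high_res = upsample(chunk, list(cache))  # past context only
        cache.append(high_res)
        yield decode(high_res)


if __name__ == "__main__":
    def toy_upsample(chunk: Chunk, past: List[Chunk]) -> Chunk:
        # Toy 2x "spatial" scaling: repeat every value twice.
        return [value for value in chunk for _ in range(2)]

    def toy_decode(latent: Chunk) -> Chunk:
        # Identity decoder, for illustration only.
        return latent

    source = iter([[1.0], [2.0], [3.0]])
    for frame in stream_cascade(source, toy_upsample, toy_decode):
        print(frame)

    # Time available per frame at the two reported frame rates.
    print(round(frame_budget_ms(30), 1), round(frame_budget_ms(18), 1))
```

Because `stream_cascade` is a generator, each high-resolution chunk becomes available as soon as its low-resolution input arrives, which is the property that makes streaming possible. At the reported frame rates, the whole cascade would have roughly 33 ms per frame at ~30 FPS and roughly 56 ms per frame at ~18 FPS.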