Mar 8, 2026 | arXiv:2603.07455

Image Generation Models: A Technical History

AI Summary

This paper surveys the landscape of image generation models, providing a technical overview of VAEs, GANs, normalizing flows, autoregressive models, transformers, and diffusion models. It details the objectives, architectures, training, optimization, and limitations of each model type, while also covering recent advances in video generation. The survey concludes with a discussion of robustness, responsible deployment, and deepfake risks associated with these models.

Key Contribution

A comprehensive technical history of image generation models, from VAEs to diffusion models, that highlights the common failure modes of each model type.

Abstract

Image generation has advanced rapidly over the past decade, yet the literature appears fragmented across different models and application domains. This paper aims to offer a comprehensive survey of breakthrough image generation models, including variational autoencoders (VAEs), generative adversarial networks (GANs), normalizing flows, autoregressive and transformer-based generators, and diffusion-based methods. We provide a detailed technical walkthrough of each model type, covering its underlying objectives, architectural building blocks, and algorithmic training steps. For each model type, we present the optimization techniques as well as common failure modes and limitations. We also review recent developments in video generation and the research that made it possible to move from still frames to high-quality videos. Lastly, we cover the growing importance of robustness and responsible deployment of these models, including deepfake risks, detection, artifacts, and watermarking.

Architecture Design (Transformers, SSMs, MoE); Computer Vision; Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
