HKU | Texas A&M | Apr 13, 2026 | arXiv:2604.11089

Structured State-Space Regularization for Compact and Generation-Friendly Image Tokenization

Jinsung Lee, Jaemin Oh, Namhun Kim, Dongwon Kim, Suha Kwak

AI Summary

This paper introduces a novel regularization technique for image tokenizers that encourages latent spaces to be both compact and generation-friendly. The method aligns latent spaces with the hidden state dynamics of State-Space Models (SSMs), transferring their frequency awareness property to the latent features. By enforcing the encoding of fine spatial structures and frequency-domain cues, the regularizer leads to more effective representation and improved generative modelability, demonstrated through improved generation quality in diffusion models.

Key Contribution

Stealing frequency-awareness from State-Space Models lets image tokenizers generate higher-quality images without sacrificing compression.

Abstract

Image tokenizers are central to modern vision models, which often operate in latent spaces. An ideal latent space must be simultaneously compact and generation-friendly: it should capture an image's essential content compactly while remaining easy to model with generative approaches. In this work, we introduce a novel regularizer to align latent spaces with these two objectives. The key idea is to guide tokenizers to mimic the hidden state dynamics of state-space models (SSMs), thereby transferring their critical property, frequency awareness, to latent features. Grounded in a theoretical analysis of SSMs, our regularizer enforces encoding of fine spatial structures and frequency-domain cues into compact latent features, leading to more effective use of representation capacity and improved generative modelability. Experiments demonstrate that our method improves generation quality in diffusion models while incurring only minimal loss in reconstruction fidelity.
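To make the phrase "frequency awareness" concrete, here is a minimal sketch of a textbook property of linear state-space recurrences. It is not the paper's regularizer, whose formulation is not given in this abstract. It assumes a single-channel discrete recurrence h_t = a * h_{t-1} + b * x_t with a complex coefficient a of magnitude below 1. The function names (`ssm_scan`, `steady_state_gain`, `measured_gain`) are hypothetical and chosen only for illustration. For a sinusoidal input of angular frequency omega, the hidden state settles to an amplitude of |b| / |1 - a * exp(-i * omega)|, so the state responds most strongly near omega = arg(a).

```python
import cmath
import math


def ssm_scan(inputs, a, b=1.0):
    """Run the scalar recurrence h_t = a * h_{t-1} + b * x_t, starting from h = 0.

    Illustrative assumption: one channel with a complex coefficient `a`,
    standing in for one diagonal entry of a linear SSM.
    """
    h = 0j
    states = []
    for x in inputs:
        h = a * h + b * x
        states.append(h)
    return states


def steady_state_gain(a, omega, b=1.0):
    """Closed-form amplitude of the hidden state for the input exp(i * omega * t)."""
    return abs(b / (1 - a * cmath.exp(-1j * omega)))


def measured_gain(a, omega, n=2000):
    """Amplitude of the final hidden state after feeding n samples of a sinusoid."""
    xs = [cmath.exp(1j * omega * t) for t in range(n)]
    return abs(ssm_scan(xs, a)[-1])


if __name__ == "__main__":
    # Coefficient with magnitude 0.9 and phase pi/4 (hypothetical values).
    a = 0.9 * cmath.exp(1j * math.pi / 4)
    for omega in (0.0, math.pi / 4, math.pi / 2, math.pi):
        print(f"omega={omega:.3f}  measured={measured_gain(a, omega):.4f}  "
              f"closed_form={steady_state_gain(a, omega):.4f}")
```

In this sketch the response peaks at omega = arg(a), where the gain equals 1 / (1 - |a|). A bank of such channels with different phases therefore acts like a set of frequency-selective filters, which is one common way to read the frequency sensitivity of SSM hidden states.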

Architecture Design (Transformers, SSMs, MoE) | Computer Vision | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
