The paper introduces a flow-based language model (FLM) that performs Euclidean denoising over one-hot token encodings, challenging the assumption that discrete diffusion is necessary for generating discrete data. FLM is trained to predict the clean data with a cross-entropy objective, and a simple time reparameterization improves training stability and generation quality. On the LM1B and OWT datasets, FLM matches the generation quality of state-of-the-art discrete diffusion models. Distilling FLM into its associated flow map yields a distilled flow map language model (FMLM) that outperforms recent few-step language models, with one-step generation exceeding their 8-step quality.
Forget slow, multi-step diffusion: a distilled flow-based language model generates text in a *single* step, at a quality exceeding the 8-step output of recent few-step language models.
Language models based on discrete diffusion have attracted widespread interest for their potential to provide faster generation than autoregressive models. In practice, however, they exhibit a sharp degradation of sample quality in the few-step regime, failing to realize this promise. Here we show that language models leveraging flow-based continuous denoising can outperform discrete diffusion in both quality and speed. By revisiting the fundamentals of flows over discrete modalities, we build a flow-based language model (FLM) that performs Euclidean denoising over one-hot token encodings. We show that the model can be trained by predicting the clean data via a cross-entropy objective, where we introduce a simple time reparameterization that greatly improves training stability and generation quality. By distilling FLM into its associated flow map, we obtain a distilled flow map language model (FMLM) capable of few-step generation. On the LM1B and OWT language datasets, FLM attains generation quality matching state-of-the-art discrete diffusion models. With FMLM, our approach outperforms recent few-step language models across the board, with one-step generation exceeding their 8-step quality. Our work calls into question the widely held hypothesis that discrete diffusion processes are necessary for generative modeling over discrete modalities, and paves the way toward accelerated flow-based language modeling at scale. Code is available at https://github.com/david3684/flm.
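To make the idea of Euclidean denoising over one-hot encodings concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a linear noise-to-data path `x_t = (1 - t) * x0 + t * x1` with Gaussian noise `x0`, uses a hypothetical stand-in denoiser that simply scales the noisy input into logits, and omits the paper's time reparameterization, whose form is not given in the abstract. All function names are illustrative.

```python
import numpy as np

def one_hot(tokens, vocab_size):
    """Map integer token ids of shape (L,) to one-hot rows of shape (L, V)."""
    return np.eye(vocab_size)[tokens]

def interpolate(x0, x1, t):
    """Assumed linear path: noise x0 at t=0, clean one-hot data x1 at t=1."""
    return (1.0 - t) * x0 + t * x1

def log_softmax(logits):
    """Numerically stable log-softmax over the vocabulary axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def cross_entropy(logits, tokens):
    """Mean negative log-likelihood of the clean tokens under the logits."""
    log_probs = log_softmax(logits)
    return -log_probs[np.arange(len(tokens)), tokens].mean()

def euler_step(x_t, x1_hat, t, h):
    """One Euler step using the velocity (x1_hat - x_t) / (1 - t),
    which is what a clean-data prediction implies under the linear path."""
    return x_t + h * (x1_hat - x_t) / (1.0 - t)

rng = np.random.default_rng(0)
vocab_size = 5
tokens = np.array([2, 0, 4])            # toy "sentence" of three token ids
x1 = one_hot(tokens, vocab_size)        # clean data in Euclidean space
x0 = rng.standard_normal(x1.shape)      # Gaussian noise (assumption)
x_t = interpolate(x0, x1, t=0.3)        # noisy point on the path

# Hypothetical stand-in for a trained network: treat the scaled noisy
# input as logits over the vocabulary. A real model would be a transformer.
logits = 10.0 * x_t
loss = cross_entropy(logits, tokens)    # training signal: predict clean tokens
x1_hat = np.exp(log_softmax(logits))    # predicted clean distribution per position
x_next = euler_step(x_t, x1_hat, t=0.3, h=0.1)
```

Under this assumed path, a single Euler step from `t = 0` with step size `h = 1` lands exactly on the predicted clean data, which gives some intuition for why a model that predicts the endpoint well could generate in very few steps.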