BAIR · UIUC · Feb 26, 2026 · arXiv:2602.22661

dLLM: Simple Diffusion Language Modeling

Zhanhui Zhou, Lingjie Chen, Hanghang Tong, Dawn Song

AI Summary

The paper introduces dLLM, an open-source framework designed to unify and standardize the core components of diffusion language models (DLMs), including training, inference, and evaluation. dLLM aims to address the lack of unified frameworks and transparent implementations that hinder reproducibility and extension in the rapidly evolving field of DLMs. The framework enables users to reproduce, finetune, deploy, and evaluate existing DLMs, build small DLMs from scratch, and access released checkpoints to accelerate future research.

Key Contribution

dLLM is a unified, open-source framework for reproducing, finetuning, and building diffusion language models, including small DLMs converted from BERT-style encoders or autoregressive LMs.

Abstract

Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures. To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling -- training, inference, and evaluation -- and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Open-Source Models & Weights

Citation Metrics

Citations: 1

Influential citations: 0

References: 62

Year: 2026

Venue: N/A
