Apr 22, 2026 · arXiv:2604.20079

On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks

A. Gupta, Gururaj Deshpande, Chandreyi Chakraborty

AI Summary

This paper applies two post-training quantization (PTQ) techniques, GPTQ and a modified Hessian-Aware Quantization (HAWQ) algorithm, to a diffusion-based coding LLM (CoDA). In the authors' setup, CoDA is more robust to quantization at low bitwidths (2-4 bits) than the auto-regressive Qwen3-1.7B, with smaller accuracy degradation on the HumanEval and MBPP benchmarks. Mixed-precision configurations derived from HAWQ additionally provide trade-offs across accuracy, latency, and memory.
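To make the low-bitwidth regime concrete, the following is a minimal sketch of plain round-to-nearest symmetric quantization, the simple baseline that methods such as GPTQ improve on. It is not the paper's pipeline: the function names, the per-tensor scaling, and the Gaussian toy weights are all assumptions chosen for illustration.

```python
import random


def quantize_symmetric(weights, bits):
    """Round-to-nearest symmetric uniform quantization (illustrative only).

    Returns the dequantized values so the error can be measured directly.
    Uses one scale for the whole list (per-tensor scaling, an assumption).
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1  # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # nothing to quantize
    scale = max_abs / qmax
    dequantized = []
    for w in weights:
        q = round(w / scale)
        q = max(-qmax, min(qmax, q))  # clamp to the integer grid
        dequantized.append(q * scale)
    return dequantized


def mean_squared_error(a, b):
    """Average squared difference between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    random.seed(0)
    # Toy stand-in for a weight tensor; real layers are not exactly Gaussian.
    toy_weights = [random.gauss(0.0, 1.0) for _ in range(1000)]
    for bits in (2, 3, 4, 8):
        err = mean_squared_error(toy_weights, quantize_symmetric(toy_weights, bits))
        print(f"{bits}-bit MSE: {err:.5f}")
```

On toy data like this the reconstruction error grows sharply as the bitwidth drops toward 2, which is why the 2-4 bit range is the stress test for robustness claims.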

Key Contribution

In the evaluated setup, the diffusion language model CoDA withstands aggressive quantization (2-4 bits) better than the auto-regressive Qwen3-1.7B, suggesting a path to efficient deployment.

Abstract

Auto-regressive Large Language Models (LLMs) achieve strong performance on coding tasks but incur high memory and inference costs. Diffusion-based language models (d-LLMs) offer bounded inference cost via iterative denoising, but their behavior under post-training quantization (PTQ) has been sparsely explored. We investigate the application and robustness of PTQ techniques, specifically GPTQ and a modified Hessian-Aware Quantization (HAWQ) algorithm, on a diffusion-based coding LLM (CoDA) and compare it with Qwen3-1.7B, its auto-regressive counterpart, under a standardized evaluation pipeline. We find that in our setup, CoDA exhibits greater robustness at low bitwidths (2-4 bits), with smaller accuracy degradation across the HumanEval and MBPP benchmarks. Additionally, mixed-precision configurations derived from HAWQ provide smooth trade-offs across accuracy, latency, and memory. The results suggest that diffusion LLMs may offer advantages for efficient deployment owing to their greater resilience to quantization.
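The idea behind sensitivity-driven mixed precision can be sketched as follows. This is a hypothetical greedy allocator, not the modified HAWQ algorithm used in the paper, whose details are not given here. The function name `allocate_bits`, the candidate bitwidths, and the sensitivity scores (standing in for a Hessian-based per-layer estimate) are assumptions for illustration.

```python
def allocate_bits(sensitivity, num_params, avg_bits_budget, choices=(2, 3, 4, 8)):
    """Greedy mixed-precision assignment (illustrative sketch only).

    sensitivity: per-layer scores; higher means more fragile under quantization.
    num_params: per-layer parameter counts.
    avg_bits_budget: target average bits per parameter across all layers.
    Returns one bitwidth per layer, chosen from `choices`.
    """
    if len(sensitivity) != len(num_params):
        raise ValueError("sensitivity and num_params must have the same length")
    choices = sorted(choices)
    total_params = sum(num_params)
    budget = avg_bits_budget * total_params
    used = choices[0] * total_params  # every layer starts at the lowest bitwidth
    if used > budget:
        raise ValueError("budget is below the smallest candidate bitwidth")

    level = [0] * len(sensitivity)  # index into `choices` for each layer
    # Visit the most sensitive layers first and raise each as far as the budget allows.
    order = sorted(range(len(sensitivity)), key=lambda i: sensitivity[i], reverse=True)
    for i in order:
        while level[i] + 1 < len(choices):
            step = choices[level[i] + 1] - choices[level[i]]
            extra = step * num_params[i]
            if used + extra > budget:
                break
            used += extra
            level[i] += 1
    return [choices[l] for l in level]


if __name__ == "__main__":
    # Made-up scores and sizes for three hypothetical layers.
    bits = allocate_bits([5.0, 0.1, 1.0], [100, 100, 100], avg_bits_budget=4)
    print(bits)
```

Sweeping `avg_bits_budget` yields a family of configurations, which is one simple way to picture the accuracy, latency, and memory trade-off curve the abstract describes. Published HAWQ variants use more principled selection than this greedy pass.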

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 47

Year: 2026

Venue: N/A
