Mingu Lee

Papers on Lattice


Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (2); Inference & Quantization (2); Natural Language Processing (1); Interpretability & Mechanistic Interp (1)

Frequent co-authors

Zongyue Qin (1), Raghavv Goel (1), Mukul Gagrani (1), Risheek Garrepalli (1)

Papers (2)

Mar 9, 2026

1w ago

ConFu: Contemplate the Future for Better Speculative Sampling

By enabling draft models to "contemplate the future," ConFu achieves significant speedups in speculative decoding, outperforming EAGLE-3 by 8-11% on Llama-3 models.

Zongyue Qin, Raghavv Goel, Mukul Gagrani +4

Architecture Design (Transformers, SSMs, MoE); Inference & Quantization; Natural Language Processing

Mar 8, 2026

Sudhanshu Agrawal +3

1w ago

Skip to the Good Part: Representation Structure & Inference-Time Layer Skipping in Diffusion vs. Autoregressive LLMs

Diffusion language models have surprisingly redundant early layers, enabling nearly 20% FLOPs reduction at inference time via layer skipping without sacrificing performance.

Sudhanshu Agrawal, Chris Lott, Mingu Lee +1

Architecture Design (Transformers, SSMs, MoE); Inference & Quantization; Interpretability & Mechanistic Interp
