Yonggan Fu

Papers on Lattice

Research focus

Architecture Design (Transformers, SSMs, MoE) (3) · Tool Use & Agents (2) · Inference & Quantization (2) · Scaling Laws & Emergent Abilities (1)

Frequent co-authors

Aaron Blakeman (2), Abhibha Gupta (2), Abhinav Khattar (2), Adil Asif (2)

Papers (3)

Jun 12, 2026

AI2 · 1w ago · also NVIDIA, Institute of Medical Technology, PKU, Waterloo

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Achieving six times the inference throughput of current LLMs while maintaining accuracy, Nemotron 3 Ultra redefines performance benchmarks for agentic reasoning tasks.

NVIDIA, Aaron Blakeman, Aaron Thomas +537

Architecture Design (Transformers, SSMs, MoE) · Scaling Laws & Emergent Abilities · Tool Use & Agents

Apr 21, 2026

$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction

Ditch the slow lane: $R^2$-dLLM turbocharges diffusion language models by slashing decoding steps by up to 75% without sacrificing quality.

Zhenbang Du, Kejing Xia, Xinrui Zhong +6

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing

Apr 14, 2026

AI2 · Apr 14, 2026 · also NVIDIA, BIT, Waterloo

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Nemotron 3 Super achieves accuracy comparable to existing 120B models with significantly higher inference throughput by combining Mamba, attention, and Mixture-of-Experts layers.

Aakshita Chandiramani, Aaron Blakeman, Abdullahi Olaoye +448

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Tool Use & Agents
