Xinrui Zhong

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (2) · Scalable Oversight & Alignment Theory (1) · Tool Use & Agents (1) · Inference & Quantization (1)

Frequent co-authors

Zhuoming Chen (1) · Qilong Feng (1) · Ranajoy Sadhukhan (1) · Michael Qizhe Shieh (1)

Papers (2)

Jun 4, 2026

CMU ML · 2w ago · also NUS, Rice

Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents

Vortex achieves up to 4.7 times higher throughput for large language model serving and lets researchers prototype and evaluate sparse attention algorithms more easily.

Zhuoming Chen, Xinrui Zhong, Qilong Feng +4

Architecture Design (Transformers, SSMs, MoE) · Scalable Oversight & Alignment Theory · Tool Use & Agents

Apr 21, 2026

$R^2$-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction

Ditch the slow lane: $R^2$-dLLM turbocharges diffusion language models by slashing decoding steps by up to 75% without sacrificing quality.

Zhenbang Du, Kejing Xia, Xinrui Zhong +6

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing
