Exploit the surprisingly stable yet heterogeneous sparsity patterns across attention heads to cut LLM attention latency by 2.88x without sacrificing quality.
Beat the LLM inference bottleneck: SageSched's uncertainty-aware scheduling improves efficiency by nearly 30%, predicting output lengths to balance compute and memory demands.