Jiahao Fang

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Distributed Systems & Hardware (1)
Inference & Quantization (1)

Frequent co-authors

Sean Nian (1)
Qilong Feng (1)
Zhiyu Wu (1)
Fan Lai (1)

Papers (1)

Apr 28, 2026

Sean Nian +4, 3w ago

CacheFlow: Efficient LLM Serving with 3D-Parallel KV Cache Restoration

CacheFlow slashes LLM serving latency by up to 62% by rethinking KV cache restoration as a 3D-parallel scheduling problem, not just a recompute vs. I/O tradeoff.

Sean Nian, Jiahao Fang, Qilong Feng +2

Architecture Design (Transformers, SSMs, MoE)
Distributed Systems & Hardware
Inference & Quantization
