Lattice: the structure behind the noise

Created by Flynn Lachendro

Zhiguang Chen

Papers on Lattice: 2

Total citations: 0

Topics: 3

h-index: 0

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (2) · Distributed Systems & Hardware (2) · Inference & Quantization (2)

Frequent co-authors

Xiao Shi (1) · Yingying Sun (1) · Jiangsu Du (1) · Yutong Lu (1)

Papers (2)

Jul 6, 2026

Xiao Shi +4 · 1w ago

Communication-Aware Placement and Pruning for Efficient Mixture-of-Experts Inference

CAP achieves up to 86% higher throughput in MoE inference while preserving accuracy, using communication-aware expert placement and pruning.

Xiao Shi, Yingying Sun, Jiangsu Du +2

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware · Inference & Quantization

May 4, 2026

Hongbin Zhang +4 · May 4, 2026

PipeMax: Enhancing Offline LLM Inference on Commodity GPU Servers

Commodity GPU servers can achieve surprisingly high LLM inference throughput by cleverly orchestrating pipeline parallelism with KV cache offloading.

Hongbin Zhang, Taosheng Wei, Jiazhi Jiang +2

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware · Inference & Quantization

Hongbin Zhang (1)

Taosheng Wei (1)

Jiazhi Jiang (1)