Lattice AI Research

Research focus

Distributed Systems & Hardware (2)
Training Efficiency & Optimization (1)
Architecture Design (Transformers, SSMs, MoE) (1)
Inference & Quantization (1)

Frequent co-authors

Weijian Liu (1)
Mingzhen Li (1)
Rui Kang (1)
Chen Sun (1)

Papers (2)

Jul 6, 2026


Direct Model State Migration for Elastic Training of Large Language Models

ETC slashes migration latency by up to 6.37 times, transforming how LLMs adapt to dynamic resource environments.

Weijian Liu, Mingzhen Li, Rui Kang +3

Distributed Systems & Hardware · Training Efficiency & Optimization

May 13, 2026


KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving

Forget static KV cache compression – KVServe dynamically adapts compression strategies to your service context, slashing latency by up to 32.8x in disaggregated LLM serving.

Zedong Liu, Xinyang Ma, Dejun Luo +9

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware · Inference & Quantization

Guangming Tan
