On-device LLM inference becomes substantially faster and more energy-efficient by adaptively streaming only the most expensive parts of the KV cache from the cloud.
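One way to read "adaptively streaming only the most expensive parts of the KV cache" is as a cost-based selection problem: for each cached block, compare how long it would take to recompute locally against the bytes needed to fetch it, and stream only the blocks with the best recomputation-cost-per-byte ratio under a bandwidth budget. The sketch below illustrates that idea under this assumption; it is not the authors' actual algorithm, and all names (`KVBlock`, `plan_streaming`, `byte_budget`) are hypothetical.

```python
# Illustrative sketch only: greedy selection of KV cache blocks to stream
# from a cloud cache versus recompute on device, under a byte budget.
# The cost model and block granularity are assumptions, not the paper's method.

from dataclasses import dataclass

@dataclass
class KVBlock:
    layer: int                    # transformer layer the block belongs to
    token_range: tuple            # (start, end) token indices covered
    size_bytes: int               # bytes to transfer the block's keys/values
    recompute_ms: float           # estimated on-device recomputation time

def plan_streaming(blocks, byte_budget):
    """Stream the blocks that are most expensive to recompute per byte
    transferred, until the bandwidth budget is exhausted; recompute the rest."""
    ranked = sorted(blocks, key=lambda b: b.recompute_ms / b.size_bytes, reverse=True)
    stream, recompute, used = [], [], 0
    for b in ranked:
        if used + b.size_bytes <= byte_budget:
            stream.append(b)
            used += b.size_bytes
        else:
            recompute.append(b)
    return stream, recompute

if __name__ == "__main__":
    blocks = [
        KVBlock(layer=0, token_range=(0, 512), size_bytes=2_000_000, recompute_ms=40.0),
        KVBlock(layer=1, token_range=(0, 512), size_bytes=2_000_000, recompute_ms=180.0),
        KVBlock(layer=2, token_range=(0, 512), size_bytes=2_000_000, recompute_ms=95.0),
    ]
    stream, recompute = plan_streaming(blocks, byte_budget=4_000_000)
    print("stream from cloud:", [(b.layer, b.token_range) for b in stream])
    print("recompute on device:", [(b.layer, b.token_range) for b in recompute])
```

In this toy setup the planner streams the two blocks with the highest recomputation cost per byte and leaves the cheapest one to be recomputed on the device, which is the intuition behind sending only the "most expensive" parts of the cache over the network.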