LLMs can maintain near-perfect accuracy on long sequences while retaining only 25% of the KV cache, using a semantic clustering approach that makes CPU-GPU offloading substantially more efficient.
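The summary does not describe the method's details, but the general idea of semantic KV-cache compression can be sketched as follows: cluster the key vectors of past tokens, then keep a fixed budget of tokens (here 25%) spread across clusters, preferring tokens nearest their cluster centroid. This is a minimal illustration under assumed details (k-means clustering, centroid-nearest selection, the function name `select_kv_budget`), not the paper's actual algorithm.

```python
import numpy as np

def select_kv_budget(keys, budget_frac=0.25, n_clusters=8, n_iters=10, seed=0):
    """Illustrative semantic KV-cache selection (assumed details, not the
    paper's method): k-means over key vectors, then keep a token budget
    round-robin across clusters, centroid-nearest tokens first."""
    rng = np.random.default_rng(seed)
    n, _ = keys.shape
    budget = max(1, int(n * budget_frac))
    # Initialize centroids from randomly chosen token keys.
    centroids = keys[rng.choice(n, size=n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Assign each token's key to its nearest centroid.
        dists = np.linalg.norm(keys[:, None, :] - centroids[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        # Recompute centroids as the mean of their members.
        for c in range(n_clusters):
            members = keys[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # Rank tokens within each cluster by distance to their centroid.
    token_dist = dists[np.arange(n), assign]
    order = np.argsort(token_dist)
    per_cluster = [[i for i in order if assign[i] == c] for c in range(n_clusters)]
    # Fill the budget round-robin so every semantic cluster stays represented.
    kept, cursor = [], [0] * n_clusters
    while len(kept) < budget:
        for c in range(n_clusters):
            if cursor[c] < len(per_cluster[c]) and len(kept) < budget:
                kept.append(per_cluster[c][cursor[c]])
                cursor[c] += 1
    return np.sort(np.array(kept))
```

Only the selected indices' keys and values would stay on the GPU; the rest could be offloaded to CPU memory and fetched on demand.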