Hao Gu

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Inference & Quantization (1)

Frequent co-authors

Xintong Yang (1), Binxing Xu (1), Lujun Li (1), Bei Liu (1)

Papers (1)

May 25, 2026

Xintong Yang +8 · 2w ago · also Microsoft Research, Guangzhou City Polytechnic, HKUST

IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference

LLMs can maintain long-context performance even with aggressive KV-cache eviction by learning to predict token importance and compressing evicted tokens into a latent memory.

Xintong Yang, Hao Gu, Binxing Xu +6

Architecture Design (Transformers, SSMs, MoE)
Inference & Quantization
