Olga Kogiou

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1); Distributed Systems & Hardware (1); Inference & Quantization (1)

Frequent co-authors

Oteo Mamo (1), Hyunji Yi (1), Weikuan Yu (1)

Papers (1)

Apr 6, 2026

Comparative Characterization of KV Cache Management Strategies for LLM Inference

Stop guessing which KV cache optimization to use: this benchmark reveals exactly when vLLM, InfiniGen, or H2O will give you the best latency, throughput, and memory footprint for your LLM inference workload.

Oteo Mamo, Olga Kogiou, Hyunji Yi +1

Architecture Design (Transformers, SSMs, MoE); Distributed Systems & Hardware; Inference & Quantization
