Thanks to ZipServ's hardware-aware design, lossless compression can actually *speed up* LLM inference on GPUs rather than merely shrink model size.