Lattice: the structure behind the noise

Created by Flynn Lachendro

Zhaoning Zhang

Papers on Lattice: 1
Total citations: 0
Topics: 3
h-index: 2

Research focus

Architecture Design (Transformers, SSMs, MoE) (1) · Distributed Systems & Hardware (1) · Inference & Quantization (1)

Frequent co-authors

Baihui Liu (1) · Kaiyuan Tian (1) · Linbo Qiao (1) · Dongsheng Li (1)

Papers (1)

Apr 9, 2026

Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference

Squeeze 34% more decode speed out of your MoE model without sacrificing accuracy by intelligently budgeting expert activations.

Baihui Liu, Kaiyuan Tian, Zhaoning Zhang +2

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware · Inference & Quantization