Latticethe structure behind the noise

Papers Digest Topics Selected Labs Collections FAQ

Created by Flynn Lachendro

Papers Digest Topics Labs Saved

Search

Search papers, labs, and topics across Lattice.

Built by Flynn Lachendro·𝕏 / Twitter·RSS··FAQ·Glossary·Privacy

Yan-Hong Cui | Lattice

Yan-Hong Cui

Papers on Lattice

1

Total citations

4

Topics

3

h-index

2

Research focus

Distributed Systems & Hardware (1)Inference & Quantization (1)Open-Source Models & Weights (1)

Frequent co-authors

Bangsheng Tang (1)Carl Chengyan Fu (1)Fei Kou (1)Grigory Sizov (1)

Papers (1)

Aug 11, 2025

Aug 11, 2025·also Accenture, NUDT

Efficient Speculative Decoding for Llama at Scale: Challenges and Solutions

Speculative decoding for Llama just got 10% faster, thanks to production-scale optimizations that unlock new levels of inference efficiency.

Bangsheng Tang, Carl Chengyan Fu, Fei Kou +35

Distributed Systems & Hardware Inference & Quantization Open-Source Models & Weights

Haoci Zhang (1)