Lattice: the structure behind the noise

Created by Flynn Lachendro

Ofir Ben Shoham

Papers on Lattice: 1

Total citations: 0

Topics: 3

h-index: 4

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Inference & Quantization (1)
Natural Language Processing (1)

Papers (1)

Mar 5, 2026

Ofir Ben Shoham

Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding

Shrinking draft model vocabularies by up to 97% can significantly boost speculative decoding throughput, especially for domain-specific tasks.

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing