Lattice AI Research

Research focus

Inference & Quantization (2); Natural Language Processing (2); Training Efficiency & Optimization (1); Architecture Design (Transformers, SSMs, MoE) (1)

Frequent co-authors

Mukul Gagrani (2); Mingu Lee (1); Christopher M. Lott (1); Chris Lott (1)

Papers (2)

Mar 18, 2026

Efficient Training-Free Multi-Token Prediction via Embedding-Space Probing

LLMs can predict multiple tokens in parallel without any training, simply by cleverly probing their embedding space with dynamically generated mask tokens.

Raghavv Goel, Mukul Gagrani, Mingu Lee +2

Inference & Quantization; Natural Language Processing; Training Efficiency & Optimization

Mar 9, 2026

ConFu: Contemplate the Future for Better Speculative Sampling

By enabling draft models to "contemplate the future," ConFu achieves significant speedups in speculative decoding, outperforming EAGLE-3 by 8-11% on Llama-3 models.

Zongyue Qin, Raghavv Goel, Mukul Gagrani +3

Architecture Design (Transformers, SSMs, MoE); Inference & Quantization; Natural Language Processing
