Lattice: the structure behind the noise

Created by Flynn Lachendro

John Long

SambaNova AI

Papers on Lattice: 1

Total citations: 0

Topics: 3

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Inference & Quantization (1)
Tool Use & Agents (1)

Frequent co-authors

Shubhangi Upasani (1), Ravi Shanker Raju (1), Mengmeing Ji (1), Urmish Thakker (1)

Papers (1)

Mar 3, 2026 · also Meta AI, Microsoft Research

Cross-Family Speculative Prefill: Training-Free Long-Context Compression with Small Draft Models

Forget same-family constraints: you can compress prompts for LLaMA with a Qwen draft model and still get 90-100% of the original performance.

Shubhangi Upasani, Ravi Shanker Raju, Mengmeing Ji +3

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Tool Use & Agents

Guangtao Wang (1)