Lattice: the structure behind the noise

Created by Flynn Lachendro

Michael W. Mahoney

Papers on Lattice: 2

Total citations: 10

Topics: 4

h-index: 9

Research focus

Inference & Quantization (2) · Eval Frameworks & Benchmarks (1) · Tool Use & Agents (1) · Architecture Design (Transformers, SSMs, MoE) (1)

Frequent co-authors

Kurt Keutzer (2) · Amir Gholami (2) · Lutfi Eren Erdogan (1) · Chris Joseph John (1)

Papers (2)

BAIR · Feb 12, 2026

Agentic Test-Time Scaling for WebAgents

Uncertainty-driven dynamic compute allocation lets web agents outperform naive test-time scaling by 9.1% while using 2.3x fewer tokens.

Lutfi Eren Erdogan, Chris Joseph John, Surya Krishnapillai +3

Eval Frameworks & Benchmarks · Inference & Quantization · Tool Use & Agents

Rishabh Tiwari +9 · Feb 5, 2025 · also BAIR

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache

Forget sparse KV caches – QuantSpec's hierarchical 4-bit quantization unlocks 2.5x speedups in long-context LLM inference with >90% acceptance rates.

Rishabh Tiwari, Haocheng Xi, Aditya Tomar +7 · 10 citations

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization

Surya Krishnapillai (1)

Rishabh Tiwari (1)

Haocheng Xi (1)

Aditya Tomar (1)