B. Beckerman

Papers on Lattice

Total citations

Topics

h-index

Research focus

Architecture Design (Transformers, SSMs, MoE) (1); Distributed Systems & Hardware (1); Inference & Quantization (1)

Frequent co-authors

Igor Fedorov (1), Andrey Gromov (1), Naveen Suda (1), David Eriksson (1)

Papers (1)

Meta AI · Mar 16, 2026 · also Mila

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale

Forget exotic attention mechanisms – MobileLLM-Flash achieves up to 1.8x faster LLM prefill on mobile CPUs by smartly pruning and adapting existing architectures for on-device use.

Igor Fedorov, Andrey Gromov, B. Beckerman +12

Architecture Design (Transformers, SSMs, MoE); Distributed Systems & Hardware; Inference & Quantization
