Marco Cuturi

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Distributed Systems & Hardware (1)
Inference & Quantization (1)
Multimodal Models (1)
Scaling Laws & Emergent Abilities (1)

Frequent co-authors

Pierre Ablin (2)
Anastasiia Filippova (1)
David Grangier (1)
Joao Monteiro (1)

Papers (3)

Apr 3, 2026

Apple ML

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

Forget full KV caches: randomly routing attention across layers during training drastically cuts memory without hurting performance, and sometimes even improves it.

Anastasiia Filippova, David Grangier, Marco Cuturi +1

Architecture Design (Transformers, SSMs, MoE) · Distributed Systems & Hardware · Inference & Quantization

Feb 25, 2026

DeepMind · also Apple ML, Berkeley University, Institut National de la Recherche, UPF

The Design Space of Tri-Modal Masked Diffusion Models

Tri-modal masked diffusion models can now be trained from scratch, achieving strong results in text generation, text-to-image, and text-to-speech, thanks to a systematic exploration of the design space and a novel SDE-based batch size reparameterization.

Louis Bethune, L. Béthune, Victor Turrisi +42

Multimodal Models · Scaling Laws & Emergent Abilities · Speech & Audio

Feb 12, 2026

Szilvia Ujváry +5

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

Factually incorrect SLMs can be made more truthful by teaching them *what* to delegate, not just by minimizing loss.

Szilvia Ujváry, Louis Béthune, Pierre Ablin +3

Eval Frameworks & Benchmarks · Recommendation & Information Retrieval · Tool Use & Agents
