Pierre Ménard

Papers on Lattice

Total citations

Topics

h-index

Research focus

Training Efficiency & Optimization (3), World Models & Planning (1), Natural Language Processing (1), RLHF & Preference Learning (1), Recommendation & Information Retrieval (1)

Frequent co-authors

Michal Valko (4), Jean-Bastien Grill (1), Omar Darwiche Domingues (1), Rémi Munos (1)

Papers (4)

DeepMind · Apr 21, 2026 · also Meta AI, CERMICS École des Ponts ParisTech, INRIA, Institut universitaire de France +4

Planning in entropy-regularized Markov decision processes and games

Entropy regularization makes planning provably easy: SmoothCruiser achieves polynomial sample complexity in MDPs where standard methods fail.

Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Ménard +2

Training Efficiency & Optimization, World Models & Planning

Apr 16, 2026 · also ENSAE Paris - CREST, Paris-Saclay

Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier

Log-barrier regularization yields optimal Õ(t^{-1/4}) last-iterate convergence in uncoupled matrix games with bandit feedback, closing the gap to the theoretical limit.

Côme Fiegel, Pierre Ménard +4

Natural Language Processing, Training Efficiency & Optimization

DeepMind · May 26, 2025 · also ENS, HuggingFace, INRIA, MBZUAI +2

Accelerating Nash Learning from Human Feedback via Mirror Prox

Ditch reward models: Nash Mirror Prox achieves fast, stable convergence to a Nash equilibrium directly from human preferences, sidestepping the limitations of traditional RLHF.

D. Tiapkin, Daniele Calandriello, D. Belomestny +5

RLHF & Preference Learning, Training Efficiency & Optimization

Jun 3, 2020 · also Paris-Saclay

A single algorithm for both restless and rested rotting bandits

A single algorithm now solves both rested and restless rotting bandits, problems previously thought to require fundamentally different approaches.

Julien Seznec, Pierre Ménard, A. Lazaric +127

Recommendation & Information Retrieval
