Éric Moulines

MBZUAI - Mohamed bin Zayed University of Artificial Intelligence (United Arab Emirates), CMAP - Centre de Mathématiques Appliquées de l'Ecole polytechnique (Route de Saclay, 91128 Palaiseau Cedex - France)

Research focus

RLHF & Preference Learning (1), Training Efficiency & Optimization (1)

Frequent co-authors

D. Tiapkin (1), Daniele Calandriello (1), D. Belomestny (1), Alexey Naumov (1)

Papers (1)

DeepMind · May 26, 2025 · also ENS, HuggingFace, INRIA, MBZUAI +2

Accelerating Nash Learning from Human Feedback via Mirror Prox

Ditch reward models: Nash Mirror Prox achieves fast, stable convergence to a Nash equilibrium directly from human preferences, sidestepping the limitations of traditional RLHF.

D. Tiapkin, Daniele Calandriello, D. Belomestny +5

RLHF & Preference Learning, Training Efficiency & Optimization
