Quanquan Gu

Papers on Lattice

Frequent co-authors

Yongtao Wu (1), Luca Viano (1), Kimon Antonakopoulos (1), Yihang Chen (1), Zhenyu Zhu (1)

Papers (1)

2026

Yongtao Wu +6, 2026

Multi-Step Alignment as Markov Games: An Optimistic Online Mirror Descent Approach with Convergence Guarantees

Optimistic Multi-step Preference Optimization (OMPO) is built on the optimistic online mirror descent algorithm. The paper provides a rigorous convergence analysis showing that OMPO requires O(ϵ^{-1}) policy updates to converge to an ϵ-approximate Nash equilibrium.

Yongtao Wu, Luca Viano, Kimon Antonakopoulos +4
