DeepMind | Feb 26, 2026 | arXiv:2602.22810

Multi-agent imitation learning with function approximation: Linear Markov games and beyond

Luca Viano, Till Freihaut, Emanuele Nevali, Volkan Cevher, Matthieu Geist, Giorgia Ramponi

AI Summary

This paper presents a theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games, where transition dynamics and reward functions are linear in given features. It demonstrates that a feature-level concentrability coefficient can replace the state-action level coefficient, leading to improved sample complexity when features are informative. The authors also introduce a computationally efficient interactive MAIL algorithm for linear Markov games with sample complexity dependent only on the feature map dimension, and empirically validate a deep MAIL interactive algorithm on Tic-Tac-Toe and Connect4.
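
For orientation, the linearity assumption is often written as follows in the linear MDP literature. This is a sketch of the usual form only: the symbols $\phi$, $\mu$, and $w_i$ are notation assumed here and may differ from the paper's, which may also index these quantities by time step.

$$
P(s' \mid s, \mathbf{a}) = \langle \phi(s, \mathbf{a}), \mu(s') \rangle, \qquad r_i(s, \mathbf{a}) = \langle \phi(s, \mathbf{a}), w_i \rangle, \qquad \phi(s, \mathbf{a}) \in \mathbb{R}^d
$$

Here $\mathbf{a}$ is the joint action of all agents, $r_i$ is the reward of agent $i$, and $d$ is the feature dimension that appears in the sample complexity.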

Key Contribution

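In linear Markov games, a feature-level concentrability coefficient can replace the state-action-level one in multi-agent imitation learning, and an interactive algorithm removes the need for any concentrability coefficient, with sample complexity that depends only on the feature dimension $d$.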

Abstract

In this work, we present the first theoretical analysis of multi-agent imitation learning (MAIL) in linear Markov games, where both the transition dynamics and each agent's reward function are linear in some given features. We demonstrate that by leveraging this structure, it is possible to replace the state-action level "all policy deviation concentrability coefficient" (Freihaut et al., arXiv:2510.09325) with a concentrability coefficient defined at the feature level, which can be much smaller than the state-action analog when the features are informative about states' similarity. Furthermore, to circumvent the need for any concentrability coefficient, we turn to the interactive setting. We provide the first computationally efficient interactive MAIL algorithm for linear Markov games and show that its sample complexity depends only on the dimension of the feature map $d$. Building on these theoretical findings, we propose a deep MAIL interactive algorithm which clearly outperforms behavior cloning (BC) on games such as Tic-Tac-Toe and Connect4.
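
To illustrate why a feature-level coefficient can be much smaller than its state-action analog, here is a minimal numerical sketch. It does not reproduce the paper's definition: as a stand-in it uses the relative condition number, a common feature-level coverage measure in the linear function approximation literature. The function names, the toy distributions `mu` (data) and `nu` (target), and the feature matrices are all hypothetical.

```python
import numpy as np


def state_action_coefficient(nu, mu):
    """Worst-case density ratio max_i nu_i / mu_i (assumes mu > 0 everywhere)."""
    nu, mu = np.asarray(nu, float), np.asarray(mu, float)
    return float(np.max(nu / mu))


def feature_coefficient(phi, nu, mu):
    """Relative condition number sup_x (x' S_nu x) / (x' S_mu x),
    where S_p = sum_i p_i * phi_i phi_i' (assumes S_mu is positive definite).
    Stand-in measure, not necessarily the coefficient used in the paper."""
    phi = np.asarray(phi, float)
    cov_nu = phi.T @ (phi * np.asarray(nu, float)[:, None])
    cov_mu = phi.T @ (phi * np.asarray(mu, float)[:, None])
    chol = np.linalg.cholesky(cov_mu)
    # Whitened matrix L^{-1} S_nu L^{-T}; its top eigenvalue is the supremum.
    whitened = np.linalg.solve(chol, np.linalg.solve(chol, cov_nu).T)
    return float(np.linalg.eigvalsh(whitened).max())


# Four state-action pairs; pairs 0/1 and pairs 2/3 share the same feature.
phi_grouped = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
mu = np.array([0.49, 0.01, 0.25, 0.25])  # data distribution
nu = np.array([0.01, 0.49, 0.25, 0.25])  # target distribution

print("state-action level:", state_action_coefficient(nu, mu))
print("feature level (grouped):", feature_coefficient(phi_grouped, nu, mu))
print("feature level (one-hot):", feature_coefficient(np.eye(4), nu, mu))
```

In this toy example the data rarely visits pair 1, so the state-action ratio is roughly 49, while the grouped features treat pairs 0 and 1 as identical and the feature-level value is roughly 1. With one-hot (tabular) features the two measures coincide, which matches the intuition that the gain comes from features that are informative about similarity.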

Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
