Search papers, labs, and topics across Lattice.
This paper introduces a benchmark suite of 61 imperfect-recall decision problems, encompassing scenarios like privacy in AI and AI safety, to evaluate algorithms for finding first-order optimal strategies. They propose and analyze a family of regret matching (RM) algorithms adapted for nonlinear constrained optimization, a class previously underexplored beyond two-player zero-sum games. The key result demonstrates that RM algorithms significantly outperform standard first-order optimizers like projected gradient descent in solving these imperfect-recall problems.
Regret matching, the unsung hero of two-player zero-sum games, now dominates first-order optimizers in broader imperfect-recall decision problems, opening new avenues for AI safety and privacy.
In game theory, imperfect-recall decision problems model situations in which an agent forgets information it held before. They encompass games such as the ``absentminded driver'' and team games with limited communication. In this paper, we introduce the first benchmark suite for imperfect-recall decision problems. Our benchmarks capture a variety of problem types, including ones concerning privacy in AI systems that elicit sensitive information, and AI safety via testing of agents in simulation. Across 61 problem instances generated using this suite, we evaluate the performance of different algorithms for finding first-order optimal strategies in such problems. In particular, we introduce the family of regret matching (RM) algorithms for nonlinear constrained optimization. This class of parameter-free algorithms has enjoyed tremendous success in solving large two-player zero-sum games, but, surprisingly, they were hitherto relatively unexplored beyond that setting. Our key finding is that RM algorithms consistently outperform commonly employed first-order optimizers such as projected gradient descent, often by orders of magnitude. This establishes, for the first time, the RM family as a formidable approach to large-scale constrained optimization problems.