EPFL · Feb 26, 2026 · arXiv:2602.23277

Zeroth-Order Stackelberg Control in Combinatorial Congestion Games

Saeed Masiha, Sepehr Elahi, Negar Kiyavash, Patrick Thiran

AI Summary

The paper introduces ZO-Stackelberg, a zeroth-order optimization method for Stackelberg control in combinatorial congestion games, where the leader tunes network parameters to minimize a system-level objective evaluated at the equilibrium reached by selfishly routing users. ZO-Stackelberg combines a projection-free Frank-Wolfe equilibrium solver with a zeroth-order outer update, so it never differentiates through the equilibrium. The authors prove convergence to generalized Goldstein stationary points with explicit dependence on the equilibrium approximation error, and report orders-of-magnitude speedups over a differentiation-based baseline on real-world networks.
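
As a reading aid, the setup summarized above can be written as a bilevel problem. The notation here ($\theta$, $\Theta$, $x^\star$, $\mathcal{X}$, $\Phi$, $f$, $\delta$, $u$, $d$) is assumed for illustration and may differ from the paper's:

$$
\min_{\theta \in \Theta} \; F(\theta) := f\bigl(\theta, x^\star(\theta)\bigr), \qquad x^\star(\theta) \in \arg\min_{x \in \mathcal{X}} \Phi(x; \theta),
$$

$$
\hat g(\theta) = \frac{d}{2\delta}\,\bigl(F(\theta + \delta u) - F(\theta - \delta u)\bigr)\,u, \qquad u \sim \mathrm{Unif}(\mathbb{S}^{d-1}).
$$

Here $\theta \in \mathbb{R}^d$ collects the leader's parameters, $\Phi$ stands for a potential whose minimizers are the follower equilibria, and $\hat g$ is a standard two-point zeroth-order estimator that uses only function values of $F$. It is shown as one common choice, not necessarily the exact estimator used in the paper. In practice $F$ is evaluated at an approximate equilibrium returned by the inner solver, which is where the dependence on the equilibrium approximation error enters.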

Key Contribution

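Replacing differentiation through the equilibrium with zeroth-order (function-value-only) updates lets the leader tune network tolls and incentives in congestion games orders of magnitude faster than a differentiation-based baseline, even though the equilibrium objective is nonsmooth.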

Abstract

We study Stackelberg (leader--follower) tuning of network parameters (tolls, capacities, incentives) in combinatorial congestion games, where selfish users choose discrete routes (or other combinatorial strategies) and settle at a congestion equilibrium. The leader minimizes a system-level objective (e.g., total travel time) evaluated at equilibrium, but this objective is typically nonsmooth because the set of used strategies can change abruptly. We propose ZO-Stackelberg, which couples a projection-free Frank--Wolfe equilibrium solver with a zeroth-order outer update, avoiding differentiation through equilibria. We prove convergence to generalized Goldstein stationary points of the true equilibrium objective, with explicit dependence on the equilibrium approximation error, and analyze subsampled oracles: if an exact minimizer is sampled with probability $\kappa_m$, then the Frank--Wolfe error decays as $\mathcal{O}(1/(\kappa_m T))$. We also propose stratified sampling as a practical way to avoid a vanishing $\kappa_m$ when the strategies that matter most for the Wardrop equilibrium concentrate in a few dominant combinatorial classes (e.g., short paths). Experiments on real-world networks demonstrate that our method achieves orders-of-magnitude speedups over a differentiation-based baseline while converging to follower equilibria.
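
The two-level structure described in the abstract can be illustrated on a toy instance. The sketch below is not the authors' implementation: it assumes a network of parallel links with unit demand and affine link costs `a[e] + b[e]*x[e] + theta[e]`, an unconstrained leader (negative tolls act as subsidies), a Frank-Wolfe inner loop with exact line search, and a two-point random-direction estimator for the outer update. All function names and default parameters are hypothetical.

```python
import math
import random


def frank_wolfe_equilibrium(a, b, theta, iters=50, tol=1e-12):
    """Approximate equilibrium flow on parallel links with unit demand.

    Assumption: the equilibrium minimizes the potential
    sum_e (a[e] + theta[e]) * x[e] + 0.5 * b[e] * x[e]**2 over the simplex.
    """
    n = len(a)
    x = [1.0 / n] * n  # start from the uniform split
    for _ in range(iters):
        cost = [a[e] + theta[e] + b[e] * x[e] for e in range(n)]
        # Linear minimization oracle: send all flow to the cheapest link.
        best = min(range(n), key=cost.__getitem__)
        d = [(1.0 if e == best else 0.0) - x[e] for e in range(n)]
        gap = -sum(c * de for c, de in zip(cost, d))  # Frank-Wolfe gap
        if gap <= tol:
            break
        curv = sum(be * de * de for be, de in zip(b, d))
        # Exact line search for the quadratic potential, clipped to [0, 1].
        gamma = 1.0 if curv <= 0.0 else min(1.0, gap / curv)
        x = [xe + gamma * de for xe, de in zip(x, d)]
    return x


def total_travel_time(a, b, x):
    """Leader objective: total travel time, excluding toll payments."""
    return sum(xe * (ae + be * xe) for ae, be, xe in zip(a, b, x))


def zo_stackelberg(a, b, theta0, steps=300, lr=0.1, delta=0.05, seed=0):
    """Zeroth-order outer loop: no differentiation through the equilibrium."""
    rng = random.Random(seed)
    theta = list(theta0)
    n = len(theta)

    def objective(t):
        return total_travel_time(a, b, frank_wolfe_equilibrium(a, b, t))

    for _ in range(steps):
        # Random direction, uniform on the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
        plus = [t + delta * v for t, v in zip(theta, u)]
        minus = [t - delta * v for t, v in zip(theta, u)]
        # Two-point estimate of the gradient of the smoothed objective.
        scale = n * (objective(plus) - objective(minus)) / (2.0 * delta)
        theta = [t - lr * scale * v for t, v in zip(theta, u)]
    return theta, objective(theta)


if __name__ == "__main__":
    # Pigou-style example: link 0 costs x, link 1 costs a constant 1.
    tolls, value = zo_stackelberg([0.0, 1.0], [1.0, 0.0], [0.0, 0.0])
    print("tolls:", tolls, "total travel time:", value)
```

In this Pigou-style instance the untolled equilibrium puts all flow on the congestible link (total travel time 1), while the social optimum splits the flow evenly (total travel time 0.75), which corresponds to a toll difference of 0.5 between the two links. The leader objective has kinks where a link drops out of use, which is the kind of nonsmoothness the abstract refers to. The classical step size $2/(t+2)$ could replace the line search; it is used here only because it makes the toy inner solve essentially exact.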


Citation Metrics

Citations: 0

Influential citations: 0

References: 53

Year: 2026

Venue: N/A
