Apr 22, 2026arXiv:2604.20109

Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning

Yicheng Pan, Ruisong Zhou, Haijun Zou, Tianyou Li, Zaiwen Wen

AI Summary

This paper introduces PLMA, a permutation learning framework for solving the Quadratic Assignment Problem (QAP) that uses warm-started Markov Chain Monte Carlo (MCMC) finetuning to improve performance. PLMA employs an additive energy-based model (EBM) parameterized by a neural network with cross-graph attention, enabling efficient $O(1)$ 2-swap Metropolis-Hastings sampling. Experiments show PLMA outperforms state-of-the-art baselines on QAPLIB and Taixxeyy instances, demonstrating its effectiveness and robustness.

Key Contribution

Solving NP-hard combinatorial optimization problems like QAP just got a whole lot faster, thanks to a novel MCMC finetuning approach that achieves near-zero optimality gaps.

Abstract

The quadratic assignment problem (QAP) is a fundamental NP-hard task that poses significant challenges for both traditional heuristics and modern learning-based solvers. Existing QAP solvers still struggle to achieve consistently competitive performance across structurally diverse real-world instances. To bridge this performance gap, we propose PLMA, an innovative permutation learning framework. PLMA features an efficient warm-started MCMC finetuning procedure to enhance deployment-time performance, leveraging short Markov chains to anchor the adaptation to the promising regions previously explored. For rapid exploration via MCMC over the permutation space, we design an additive energy-based model (EBM) that enables an $O(1)$-time 2-swap Metropolis-Hastings sampling step. Moreover, the neural network used to parameterize the EBM incorporates a scalable and flexible cross-graph attention mechanism to model interactions between facilities and locations in the QAP. Extensive experiments demonstrate that PLMA consistently outperforms state-of-the-art baselines across various benchmarks. In particular, PLMA achieves a near-zero average optimality gap on QAPLIB, exhibits remarkably superior robustness on the notoriously difficult Taixxeyy instances, and also serves as an effective QAP solver in bandwidth minimization.

Reasoning & Chain-of-Thought Tool Use & Agents World Models & Planning

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning

Related Papers