Search papers, labs, and topics across Lattice.
This paper reframes offline model-based optimization (MBO) as a ranking problem, focusing on distinguishing near-optimal designs rather than accurate value prediction via regression. They introduce an optimization-oriented risk based on ranking and develop a theoretical framework connecting surrogate learning to optimization performance. The authors propose a distribution-aware ranking method to mitigate distributional mismatch between training data and near-optimal designs, demonstrating its superiority over existing methods across various tasks.
Offline model-based optimization is fundamentally a ranking problem, and focusing on ranking near-optimal designs beats traditional regression-based surrogate modeling.
Offline model-based optimization (MBO) seeks to discover high-performing designs using only a fixed dataset of past evaluations. Most existing methods rely on learning a surrogate model via regression and implicitly assume that good predictive accuracy leads to good optimization performance. In this work, we challenge this assumption and study offline MBO from a learnability perspective. We argue that offline optimization is fundamentally a problem of ranking high-quality designs rather than accurate value prediction. Specifically, we introduce an optimization-oriented risk based on ranking between near-optimal and suboptimal designs, and develop a unified theoretical framework that connects surrogate learning to final optimization. We prove the theoretical advantages of ranking over regression, and identify distributional mismatch between the training data and near-optimal designs as the dominant error. Inspired by this, we design a distribution-aware ranking method to reduce this mismatch. Empirical results across various tasks show that our approach outperforms twenty existing methods, validating our theoretical findings. Additionally, both theoretical and empirical results reveal intrinsic limitations in offline MBO, showing a regime in which no offline method can avoid over-optimistic extrapolation.