aiXopt GmbH | RWTH | University of Bern | UZH | Jun 1, 2026 | arXiv:2606.02351

Local Preferential Bayesian Optimization

Johanna Menn, Miriam Kober, Paul Brunzema, David Stenger, Sebastian Trimpe

AI Summary

This paper introduces local preferential Bayesian optimization (PBO) methods that adapt trust-region and derivative-informed local search from high-dimensional Bayesian optimization to pairwise human feedback. On GP sample paths, standard optimization benchmark functions, and policy-search tasks, the local methods can substantially reduce cumulative regret relative to global preference-based baselines, especially in high-dimensional, complex landscapes with steep optima. This makes local PBO a good fit for real-world preference-based tasks such as policy search.

Key Contribution

Local PBO methods can substantially reduce cumulative regret in high-dimensional optimization tasks compared with global preference-based baselines.

Abstract

Bayesian optimization (BO) is a popular and effective approach for tuning expensive, noisy experiments, but requires the formulation of an explicit objective function. Preferential BO (PBO) removes this requirement by learning from pairwise human feedback, yet existing methods struggle to efficiently optimize beyond low- and medium-dimensional problems due to their global search approaches. We address this limitation by developing a family of local PBO methods that transfer key ideas from high-dimensional BO to the preferential setting. In particular, we introduce local PBO methods which adapt trust-region and derivative-informed local search to pairwise preference feedback, where the latter exploits first- and second-order derivatives of the Laplace-approximated GP posterior. Our benchmark on GP sample paths, standard optimization benchmark functions, and policy-search tasks shows that local PBO methods are especially effective in high-dimensional and complex landscapes with steep optima. Compared with global preference-based baselines, they can substantially reduce cumulative regret, making them particularly useful for real-world preference-based optimization tasks such as policy search.
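
To make the ingredients named in the abstract concrete, the following is a minimal sketch of a Laplace approximation for a preference GP and of the gradient of its posterior mean, which is the kind of first-order information a derivative-informed local search could use. It is not the authors' implementation. The logistic (Bradley-Terry) preference likelihood, the RBF kernel, the lengthscale, and the function names `rbf`, `laplace_map`, and `mean_gradient` are all assumptions chosen for illustration; the abstract does not specify them, and second-order derivatives and the trust-region variant are omitted.

```python
import numpy as np

def rbf(A, B, ell=0.5):
    """Squared-exponential kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def laplace_map(X, pairs, ell=0.5, iters=50, tol=1e-10):
    """Newton iterations for the MAP of latent utilities f under a GP prior.

    pairs: list of (winner, loser) index tuples into X.
    Assumed likelihood (illustrative): p(winner > loser | f) = sigmoid(f_w - f_l).
    """
    n = len(X)
    K = rbf(X, X, ell) + 1e-8 * np.eye(n)  # small jitter for stability
    # Row k of D is e_winner - e_loser, so D @ f gives utility differences.
    D = np.zeros((len(pairs), n))
    for k, (w, l) in enumerate(pairs):
        D[k, w], D[k, l] = 1.0, -1.0
    f = np.zeros(n)
    for _ in range(iters):
        s = 1.0 / (1.0 + np.exp(-(D @ f)))          # sigmoid of differences
        g = D.T @ (1.0 - s)                          # gradient of log-likelihood
        W = D.T @ (D * (s * (1.0 - s))[:, None])     # negative log-lik Hessian
        # Newton step: f_new = (K^{-1} + W)^{-1} (W f + g), without inverting K.
        f_new = np.linalg.solve(np.eye(n) + K @ W, K @ (W @ f + g))
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return f, K

def mean_gradient(x, X, f_map, K, ell=0.5):
    """Gradient of the Laplace posterior mean mu(x) = k(x, X) K^{-1} f_map."""
    alpha = np.linalg.solve(K, f_map)
    k = rbf(x[None, :], X, ell)[0]                   # shape (n,)
    dk = -(x[None, :] - X) / ell**2 * k[:, None]     # d k(x, x_i) / dx, shape (n, d)
    return dk.T @ alpha                              # shape (d,)

# Toy 1-D example: 0.5 is preferred over 0.0, and 1.0 is preferred over 0.5.
X = np.array([[0.0], [0.5], [1.0]])
pairs = [(1, 0), (2, 1)]
f_map, K = laplace_map(X, pairs)
grad = mean_gradient(np.array([0.5]), X, f_map, K)
```

In this toy example the estimated utilities increase with x, and the mean gradient at x = 0.5 is positive, so a gradient-based local step would move toward the preferred region.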

RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
