This paper introduces local preferential Bayesian optimization (PBO) methods that use trust-region and derivative-informed local search to optimize high-dimensional problems from pairwise human feedback. By adapting these techniques from standard Bayesian optimization, the authors show that local PBO can substantially reduce cumulative regret relative to global preference-based baselines, especially in complex landscapes with steep optima. The results suggest that local PBO is well suited to real-world preference-based tasks such as policy search.
Local PBO methods can substantially reduce cumulative regret in high-dimensional optimization tasks compared with global preference-based baselines.
Bayesian optimization (BO) is a popular and effective approach for tuning expensive, noisy experiments, but requires the formulation of an explicit objective function. Preferential BO (PBO) removes this requirement by learning from pairwise human feedback, yet existing methods struggle to efficiently optimize beyond low- and medium-dimensional problems due to their global search approaches. We address this limitation by developing a family of local PBO methods that transfer key ideas from high-dimensional BO to the preferential setting. In particular, we introduce local PBO methods which adapt trust-region and derivative-informed local search to pairwise preference feedback, where the latter exploits first- and second-order derivatives of the Laplace-approximated GP posterior. Our benchmark on GP sample paths, standard optimization benchmark functions, and policy-search tasks shows that local PBO methods are especially effective in high-dimensional and complex landscapes with steep optima. Compared with global preference-based baselines, they can substantially reduce cumulative regret, making them particularly useful for real-world preference-based optimization tasks such as policy search.
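To make the trust-region idea concrete, the following is a minimal sketch of a local search driven only by pairwise comparisons. It is not the authors' method: it omits the GP surrogate and the Laplace-approximated posterior entirely and instead proposes uniformly random candidates inside the trust region. The latent utility, the simulated `duel` oracle, the function names, and all thresholds are hypothetical choices made for illustration.

```python
import numpy as np

def latent_utility(x):
    # Hypothetical hidden utility; the optimizer never observes its values,
    # only the outcome of comparisons between two points.
    return -np.sum(x ** 2)

def duel(x_new, x_old, rng, noise=0.1):
    # Simulated human feedback: True if x_new is preferred over x_old.
    # Gaussian noise on the utility difference mimics inconsistent answers.
    diff = latent_utility(x_new) - latent_utility(x_old)
    return diff + noise * rng.standard_normal() > 0

def trust_region_duels(x0, n_iter=200, length=0.4, noise=0.1, seed=0,
                       succ_tol=3, fail_tol=5, l_min=1e-3, l_max=1.6):
    rng = np.random.default_rng(seed)
    center = np.asarray(x0, dtype=float)
    n_succ = n_fail = 0
    for _ in range(n_iter):
        # Propose a candidate inside the box-shaped trust region.
        # (Assumption: a surrogate-based method would pick this point
        # with an acquisition function instead of at random.)
        cand = center + rng.uniform(-length / 2, length / 2, size=center.shape)
        if duel(cand, center, rng, noise):
            # Candidate won the duel: move the trust region to it.
            center, n_succ, n_fail = cand, n_succ + 1, 0
        else:
            n_succ, n_fail = 0, n_fail + 1
        # Expand after consecutive wins, shrink after consecutive losses.
        if n_succ >= succ_tol:
            length, n_succ = min(2 * length, l_max), 0
        elif n_fail >= fail_tol:
            length, n_fail = max(length / 2, l_min), 0
    return center, length

x_start = np.full(10, 0.8)
x_final, final_length = trust_region_duels(x_start)
print(latent_utility(x_start), latent_utility(x_final), final_length)
```

The loop only ever queries which of two points is preferred, which is the feedback model of the preferential setting. Keeping the search inside a region that grows on success and shrinks on failure is what makes it local rather than global.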