DeepMind | Feb 16, 2026 | arXiv:2602.15206

MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference

Raphaël Baur, Yannick Metz, Maria Gkoulta, Mennatallah El-Assady, Giorgia Ramponi, Thomas Kleine Buening

AI Summary

The paper introduces MAVRL, a Bayesian approach to reward learning that infers a shared latent reward function from heterogeneous feedback types (demonstrations, comparisons, ratings, stops) by modeling each feedback type with an explicit likelihood. This eliminates the need to weight different feedback types manually and avoids reducing them to a common intermediate representation. The authors report that MAVRL outperforms single-feedback baselines, exploits complementary information across feedback types, and produces policies that are more robust to environment perturbations, while also providing interpretable uncertainty estimates.

Key Contribution

Stop hand-tuning reward learning losses: MAVRL learns a shared reward function from diverse feedback signals by treating each as a likelihood within a Bayesian inference framework.
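As a rough illustration of what "treating each feedback type as a likelihood" can look like, the sketch below scores a candidate tabular reward against two feedback types at once. It is not the paper's implementation: the Bradley-Terry comparison model, the Gaussian rating model, the tabular reward, and all function names are assumptions chosen for illustration. The point is only that, once each feedback type has a likelihood, the log-likelihoods add without hand-tuned weights.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def trajectory_return(reward, trajectory):
    # Sum of per-state rewards along a trajectory of state indices.
    return sum(reward[s] for s in trajectory)


def comparison_loglik(reward, comparisons):
    # Assumed Bradley-Terry model: P(a preferred over b) = sigmoid(R(a) - R(b)).
    return sum(
        log_sigmoid(trajectory_return(reward, a) - trajectory_return(reward, b))
        for a, b in comparisons
    )


def rating_loglik(reward, ratings, noise_std=1.0):
    # Assumed Gaussian model: rating ~ Normal(R(trajectory), noise_std**2).
    total = 0.0
    for trajectory, rating in ratings:
        err = rating - trajectory_return(reward, trajectory)
        total += -0.5 * (err / noise_std) ** 2
        total -= math.log(noise_std * math.sqrt(2.0 * math.pi))
    return total


def joint_loglik(reward, comparisons, ratings):
    # Feedback types are assumed conditionally independent given the reward,
    # so their log-likelihoods simply add; there are no loss weights to tune.
    return comparison_loglik(reward, comparisons) + rating_loglik(reward, ratings)


# Hypothetical toy data: three states, one comparison, two ratings.
comparisons = [([2, 2], [0, 0])]            # first trajectory was preferred
ratings = [([2, 2], 4.0), ([0, 0], 0.0)]    # (trajectory, scalar rating)

reward_consistent = [0.0, 1.0, 2.0]
reward_inconsistent = [2.0, 1.0, 0.0]

score_consistent = joint_loglik(reward_consistent, comparisons, ratings)
score_inconsistent = joint_loglik(reward_inconsistent, comparisons, ratings)
```

Here the reward that agrees with both the comparison and the ratings receives the higher joint log-likelihood. A Bayesian treatment would combine this quantity with a prior over rewards instead of maximizing it directly.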

Abstract

Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, it remains unclear how to jointly learn reward functions from heterogeneous feedback types such as demonstrations, comparisons, ratings, and stops that provide qualitatively different signals. We address this challenge by formulating reward learning from multiple feedback types as Bayesian inference over a shared latent reward function, where each feedback type contributes information through an explicit likelihood. We introduce a scalable amortized variational inference approach that learns a shared reward encoder and feedback-specific likelihood decoders and is trained by optimizing a single evidence lower bound. Our approach avoids reducing feedback to a common intermediate representation and eliminates the need for manual loss balancing. Across discrete and continuous-control benchmarks, we show that jointly inferred reward posteriors outperform single-type baselines, exploit complementary information across feedback types, and yield policies that are more robust to environment perturbations. The inferred reward uncertainty further provides interpretable signals for analyzing model confidence and consistency across feedback types.
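The "single evidence lower bound" mentioned above can be written generically as follows. This is the standard ELBO for a latent-variable model with several observation types, not necessarily the exact objective used in the paper. The notation is assumed here: r is the latent reward, D_k is the data for feedback type k (k = 1, ..., K), q_phi is the shared reward encoder, p_{theta_k} is the likelihood decoder for feedback type k, and p(r) is a prior. The sum inside the expectation relies on the assumption that feedback types are conditionally independent given r.

```latex
\log p_\theta(D_1, \dots, D_K)
  \;\ge\;
  \mathbb{E}_{q_\phi(r \mid D_{1:K})}
    \Bigg[ \sum_{k=1}^{K} \log p_{\theta_k}(D_k \mid r) \Bigg]
  \;-\;
  \mathrm{KL}\big( q_\phi(r \mid D_{1:K}) \,\|\, p(r) \big)
```

Each feedback type enters through its own log-likelihood term, so no per-type loss weights appear in the objective.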

RLHF & Preference Learning | Robotics & Embodied AI

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
