This paper generalizes the connection between Direct Preference Optimization (DPO) and human choice theory, extending the normative framework underlying DPO. By reworking standard human choice theory, the authors show that any compliant analytical choice model from machine learning can be embedded within any human choice model. The generalization supports non-convex losses and provides a unifying framework for DPO extensions such as margins and length correction.
DPO's success isn't just clever engineering: it's deeply rooted in human choice theory, unlocking a surprisingly flexible framework for preference optimization and justifying many DPO extensions.
Normative theories allow one to elicit key parts of an ML algorithm from first principles, which is crucial at a time when scrutiny of ML work is widely championed. Direct Preference Optimization (DPO) cleverly bypasses reward modeling by making an explicit link with a specific normative model of human choice. Our paper elevates this connection to the full generality of DPO's normative framework. Getting there requires reworking human choice theory's textbook path for a better fit with RLHF and ML. The result is a remarkably broad viewpoint on preference optimization, given the current panorama of DPO follow-ups. It also unveils unexpected riches for ML, chief among them support for non-convex losses, the fact that any compliant ML analytical choice can be embedded with any human choice model, and a normative umbrella wide enough to safeguard DPO's extensions (margins, length correction, ...). A toy experiment "far away" from the DPO crowd is also given.
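For readers outside the DPO literature, the specific normative choice model the abstract alludes to is the Bradley-Terry model, and the resulting objective from Rafailov et al. (2023) is the convex loss that this paper's framework generalizes. The sketch below uses the standard DPO notation (policy $\pi_\theta$, frozen reference $\pi_{\mathrm{ref}}$, temperature $\beta$); it is background material rather than this paper's own formulation:

$$
p^*(y_w \succ y_l \mid x) = \sigma\big(r^*(x, y_w) - r^*(x, y_l)\big),
\qquad
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
$$

The left identity is the Bradley-Terry choice probability with latent reward $r^*$; substituting the reward implied by the KL-regularized RLHF objective yields the DPO loss on the right. It is this derivation path that the paper reworks and broadens, e.g., to non-convex losses and to margin and length-correction variants.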