2026

wDPO: Winsorized Direct Preference Optimization for Robust LLM Alignment

Jilong Liu, Yonghui Yang, Pengyang Shao, Haokai Ma, Wei Qin, Richang Hong

AI Summary

This work shows that robust preference alignment benefits from addressing different noise types with targeted interventions rather than uniform regularization, and proposes wDPO, a robust LLM alignment approach with hierarchical winsorization.

Citation Metrics

Citations: 0

Influential citations: 0

References: 37

Year: 2026

Venue: arXiv.org
