BeihangGroup | May 26, 2026 | arXiv:2605.26878

Multi-Stakeholder LLM Alignment: Decomposing Estimation from Aggregation

Lulu Zheng, Wenjin Yang, Xiangwen Zhang, Rong Yin, Yulan Hu, Zheng Pan

AI Summary

This paper shows that holistic LLM judges conflate utility estimation with utility aggregation in multi-stakeholder tasks, which yields unstable implicit stakeholder weights and score shifts, especially when stakeholder satisfaction is dispersed. The authors report that this "weighting noise" increases with stakeholder count. To address it, they propose DecompR, a method that decouples utility estimation from aggregation: counterfactual-calibrated weights are fixed from query structure before candidates are scored, and per-role utilities are estimated independently.

Key Contribution

LLM judges in multi-stakeholder settings suffer from "weighting noise" that, in the authors' experiments, grows as stakeholders are added; fixing the weights before scoring stabilizes the process.
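To see why weight instability matters most when satisfaction is dispersed, consider a generic illustration. It assumes a linear aggregate over n stakeholders with weights summing to one; the symbols s, w, u, and \delta are introduced here for illustration, and this is not claimed to be the paper's own formalization or theorem.

```latex
% Assumed linear aggregate with normalized weights:
%   s(w) = \sum_{i=1}^{n} w_i u_i, \qquad \sum_{i=1}^{n} w_i = 1.
% Perturb the weights by \delta with \sum_i \delta_i = 0 (weights stay normalized).
% Let \bar{u} = \frac{1}{n}\sum_i u_i and \sigma_u^2 = \frac{1}{n}\sum_i (u_i - \bar{u})^2.
\begin{align}
  s(w + \delta) - s(w)
    &= \sum_{i=1}^{n} \delta_i u_i
     = \sum_{i=1}^{n} \delta_i \,(u_i - \bar{u}), \\
  \bigl| s(w + \delta) - s(w) \bigr|
    &\le \lVert \delta \rVert_2 \,\Bigl( \sum_{i=1}^{n} (u_i - \bar{u})^2 \Bigr)^{1/2}
     = \lVert \delta \rVert_2 \, \sqrt{n}\, \sigma_u .
\end{align}
```

The second equality in the first line holds because the perturbation sums to zero, and the bound follows from the Cauchy-Schwarz inequality. Under these assumptions the shift vanishes when every stakeholder is equally satisfied, and the room for a shift grows with the spread of the utilities.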

Abstract

Multi-stakeholder tasks require one output to satisfy users with conflicting preferences. Holistic LLM judges conflate utility estimation and utility aggregation, yielding unstable implicit weights. We show empirically and theoretically that this aggregation-specific "weighting noise" can create large score shifts when stakeholder satisfaction is dispersed; in our experiments, these weight-induced shifts also increase with stakeholder count. We propose DecompR: counterfactual-calibrated weights are fixed from query structure before candidate scoring, while per-role utilities are estimated independently, removing candidate-dependent weight drift and reducing estimation noise.
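The decoupling described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weighted-sum aggregation, the function names, the role names, and the toy utility table are all hypothetical, and the weights are supplied by hand here rather than obtained by counterfactual calibration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: estimates one role's utility for one candidate, in [0, 1].
UtilityFn = Callable[[str, str], float]


def normalize(raw_weights: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative role weights so that they sum to one."""
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {role: w / total for role, w in raw_weights.items()}


def decomposed_score(candidate: str,
                     weights: Dict[str, float],
                     estimate_utility: UtilityFn) -> float:
    """Aggregate independently estimated per-role utilities with fixed weights.

    Assumption: a weighted sum is used as the aggregation rule.
    """
    return sum(w * estimate_utility(role, candidate) for role, w in weights.items())


def rank_candidates(candidates: List[str],
                    raw_weights: Dict[str, float],
                    estimate_utility: UtilityFn) -> List[Tuple[str, float]]:
    """Fix the weights once, before any candidate is scored, then rank."""
    weights = normalize(raw_weights)  # the same weights apply to every candidate
    scored = [(c, decomposed_score(c, weights, estimate_utility)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy per-role utilities (made up for illustration; a real system would query a judge).
TOY_UTILITIES = {
    ("patient", "A"): 0.9, ("clinician", "A"): 0.3,
    ("patient", "B"): 0.6, ("clinician", "B"): 0.6,
}


def toy_utility(role: str, candidate: str) -> float:
    return TOY_UTILITIES[(role, candidate)]


ranking = rank_candidates(["A", "B"], {"patient": 1.0, "clinician": 3.0}, toy_utility)
```

Because the weights are computed before the loop over candidates, no candidate can change how much each role counts, which is the property the abstract calls removing candidate-dependent weight drift.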

Constitutional AI & AI Ethics | Eval Frameworks & Benchmarks | RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
