HITApr 21, 2026arXiv:2604.18982

SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

Xiachong Feng, Yilei Jiang, Xiaocheng Feng, Deyi Yin, Libo Qin, Yangfan Ye, Lei Huang, Weitao Ma, Yuxuan Gu, Chonghan Qin, Bing Qin, Lingpeng Kong

AI Summary

The paper introduces SAVOIR, a reinforcement learning framework for training socially intelligent language agents that uses Shapley values to address the credit assignment problem in multi-turn dialogues. SAVOIR combines expected utility shifts for prospective valuation of utterances with Shapley values for fair reward distribution, providing axiomatic guarantees. Experiments on the SOTOPIA benchmark show that SAVOIR achieves state-of-the-art performance, even surpassing GPT-4o and Claude-3.5-Sonnet, and highlighting the distinct capabilities needed for social intelligence compared to analytical reasoning.

Key Contribution

Social intelligence may require more than just reasoning power: a 7B model trained with SAVOIR beats GPT-4o and Claude-3.5-Sonnet on social interaction tasks.

Abstract

Social intelligence, the ability to navigate complex interpersonal interactions, presents a fundamental challenge for language agents. Training such agents via reinforcement learning requires solving the credit assignment problem: determining how individual utterances contribute to multi-turn dialogue outcomes. Existing approaches directly employ language models to distribute episode-level rewards, yielding attributions that are retrospective and lack theoretical grounding. We propose SAVOIR (ShApley Value fOr SocIal RL), a novel principled framework grounded in cooperative game theory. Our approach combines two complementary principles: expected utility shifts evaluation from retrospective attribution to prospective valuation, capturing an utterance's strategic potential for enabling favorable future trajectories; Shapley values ensure fair credit distribution with axiomatic guarantees of efficiency, symmetry, and marginality. Experiments on the SOTOPIA benchmark demonstrate that SAVOIR achieves new state-of-the-art performance across all evaluation settings, with our 7B model matching or exceeding proprietary models including GPT-4o and Claude-3.5-Sonnet. Notably, even large reasoning models consistently underperform, suggesting social intelligence requires qualitatively different capabilities than analytical reasoning.

Natural Language Processing RLHF & Preference Learning Tool Use & Agents

Citation Metrics

Citations0

Influential citations0

References25

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

Related Papers