Feb 17, 2026 | arXiv:2602.15407

Fairness over Equality: Correcting Social Incentives in Asymmetric Sequential Social Dilemmas

Alper Demir, Hüseyin Aydın, Kale-ab Abebe Tessera, David Abel, Stefano V. Albrecht

AI Summary

This paper investigates the limitations of existing fairness-based methods in Multi-Agent Reinforcement Learning (MARL) when applied to asymmetric Sequential Social Dilemmas (SSDs), where agents have different incentives. It finds that enforcing raw equality in such scenarios can incentivize defection. To address this, the authors propose modifications to fairness definitions by accounting for reward ranges, introducing agent-based weighting, and localizing social feedback. The proposed method demonstrates improved cooperation in asymmetric SSDs compared to existing approaches, while maintaining scalability and practicality.

Key Contribution

Naive fairness objectives in multi-agent reinforcement learning can backfire in asymmetric social dilemmas, but a few simple tweaks can restore cooperation.

Abstract

Sequential Social Dilemmas (SSDs) provide a key framework for studying how cooperation emerges when individual incentives conflict with collective welfare. In Multi-Agent Reinforcement Learning, these problems are often addressed by incorporating intrinsic drives that encourage prosocial or fair behavior. However, most existing methods assume that agents face identical incentives in the dilemma and require continuous access to global information about other agents to assess fairness. In this work, we introduce asymmetric variants of well-known SSD environments and examine how natural differences between agents influence cooperation dynamics. Our findings reveal that existing fairness-based methods struggle under asymmetric conditions because they enforce raw equality, which wrongly incentivizes defection. To address this, we propose three modifications: (i) redefining fairness by accounting for agents' reward ranges, (ii) introducing an agent-based weighting mechanism to better handle inherent asymmetries, and (iii) localizing social feedback to make the methods effective under partial observability without requiring global information sharing. Experimental results show that in asymmetric scenarios, our method fosters faster emergence of cooperative policies compared to existing approaches, without sacrificing scalability or practicality.
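
The abstract does not give formulas, so the following is only an illustrative sketch of how the three modifications could fit together, not the authors' implementation. It assumes a Fehr-Schmidt style inequity penalty as a stand-in for "existing fairness-based methods". The function names, the `alpha` and `beta` coefficients, the min-max normalization, and the `neighbors` and `weights` parameters are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def normalize_by_range(
    rewards: Sequence[float], ranges: Sequence[Tuple[float, float]]
) -> List[float]:
    """Assumed form of (i): rescale each agent's reward to [0, 1]
    using that agent's own (min, max) attainable reward."""
    return [(r - lo) / (hi - lo) for r, (lo, hi) in zip(rewards, ranges)]


def inequity_penalty(
    values: Sequence[float],
    i: int,
    alpha: float = 5.0,   # hypothetical weight on disadvantageous inequity
    beta: float = 0.05,   # hypothetical weight on advantageous inequity
    neighbors: Optional[Sequence[int]] = None,  # assumed form of (iii)
    weights: Optional[Sequence[float]] = None,  # assumed form of (ii)
) -> float:
    """Fehr-Schmidt style penalty for agent i, compared only against
    the agents in `neighbors` (all other agents if None)."""
    candidates = neighbors if neighbors is not None else range(len(values))
    others = [j for j in candidates if j != i]
    w = weights if weights is not None else [1.0] * len(values)
    total_w = sum(w[j] for j in others)
    if total_w == 0:
        return 0.0  # nobody to compare against
    behind = sum(w[j] * max(values[j] - values[i], 0.0) for j in others)
    ahead = sum(w[j] * max(values[i] - values[j], 0.0) for j in others)
    return (alpha * behind + beta * ahead) / total_w


# Two fully cooperating agents with different attainable reward ranges.
rewards = [1.0, 4.0]
ranges = [(0.0, 1.0), (0.0, 4.0)]

raw_penalty = inequity_penalty(rewards, i=0)
fair_penalty = inequity_penalty(normalize_by_range(rewards, ranges), i=0)
print(raw_penalty, fair_penalty)
```

In this toy case both agents earn the maximum they can, yet comparing raw rewards still penalizes agent 0 for earning less than agent 1. After range normalization the two agents are treated as equally well off and the penalty vanishes, which is the intuition behind preferring fairness over raw equality.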

Constitutional AI & AI Ethics | RLHF & Preference Learning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
