NUS | Apr 9, 2026 | arXiv:2604.08438

Provably Adaptive Linear Approximation for the Shapley Value and Beyond

Weida Li, Yaoliang Yu, Bryan Kian Hsiang Low

AI Summary

This paper investigates the efficient approximation of Shapley values and semi-values under a linear space constraint, addressing a key challenge in large-scale attribution problems. The authors develop a theoretical framework based on a vector concentration inequality to derive sharper query complexities for existing unbiased randomized algorithms. They then introduce Adalina, the first adaptive, linear-time, linear-space randomized algorithm that theoretically achieves improved mean square error in approximating Shapley values.

Key Contribution

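Adalina is the first adaptive, linear-time, linear-space randomized algorithm for approximating Shapley values that theoretically achieves improved mean square error. The underlying framework also yields a linear-space algorithm that needs $O(\frac{n}{\epsilon^{2}}\log\frac{1}{\delta})$ utility queries to reach $\ell_2$ error below $\epsilon$ with probability at least $1-\delta$ for all commonly used semi-values.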

Abstract

The Shapley value, and its broader family of semi-values, has received much attention in various attribution problems. A fundamental and long-standing challenge is their efficient approximation, since exact computation generally requires an exponential number of utility queries in the number of players $n$. To meet the challenges of large-scale applications, we explore the limits of efficiently approximating semi-values under a $\Theta(n)$ space constraint. Building upon a vector concentration inequality, we establish a theoretical framework that enables sharper query complexities for existing unbiased randomized algorithms. Within this framework, we systematically develop a linear-space algorithm that requires $O(\frac{n}{\epsilon^{2}}\log\frac{1}{\delta})$ utility queries to ensure $P(\|\hat{\boldsymbol\phi}-\boldsymbol\phi\|_{2}\geq\epsilon)\leq \delta$ for all commonly used semi-values. In particular, our framework naturally bridges OFA, unbiased kernelSHAP, SHAP-IQ and the regression-adjusted approach, and definitively characterizes when paired sampling is beneficial. Moreover, our algorithm allows explicit minimization of the mean square error for each specific utility function. Accordingly, we introduce the first adaptive, linear-time, linear-space randomized algorithm, Adalina, that theoretically achieves improved mean square error. All of our theoretical findings are experimentally validated.
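To make the cost gap in the abstract concrete, the sketch below contrasts exact Shapley computation, which enumerates every coalition, with the classical permutation-sampling Monte Carlo estimator, which keeps only $\Theta(n)$ state. This is not Adalina or any algorithm from the paper; it is a standard unbiased baseline shown for illustration only. The function names, the `utility` callback signature (a function from a `frozenset` of player indices to a number), and the toy game are all assumptions made for this example.

```python
import itertools
import math
import random


def exact_shapley(utility, n):
    """Exact Shapley values by brute force.

    Evaluates 2^(n-1) marginal contributions per player, so it is only
    feasible for very small n. `utility` is a hypothetical callback that
    maps a frozenset of player indices to a number.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size.
            weight = (
                math.factorial(size)
                * math.factorial(n - size - 1)
                / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (utility(s | {i}) - utility(s))
    return phi


def permutation_shapley(utility, n, num_permutations, seed=0):
    """Classical permutation-sampling estimator (NOT the paper's method).

    Each sampled permutation costs n + 1 utility queries. The only state
    kept is the running sums, the current permutation, and the current
    coalition, so memory is Theta(n).
    """
    rng = random.Random(seed)
    phi = [0.0] * n
    players = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(players)
        coalition = set()
        prev = utility(frozenset(coalition))
        for i in players:
            coalition.add(i)
            cur = utility(frozenset(coalition))
            phi[i] += cur - prev  # marginal contribution of player i
            prev = cur
    return [total / num_permutations for total in phi]


if __name__ == "__main__":
    # Toy symmetric game, chosen only for illustration: U(S) = |S|^2.
    def toy_utility(s):
        return len(s) ** 2

    print(exact_shapley(toy_utility, 5))
    print(permutation_shapley(toy_utility, 5, num_permutations=2000))
```

Each sampled permutation telescopes, so the estimates always sum to $U(N) - U(\emptyset)$. The estimator is unbiased but not adaptive: it uses the same sampling scheme for every utility function, which is the kind of limitation the abstract's per-utility minimization of mean square error is aimed at.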

Citation Metrics

Citations: 0

Influential citations: 0

References: 36

Year: 2026

Venue: N/A
