This paper introduces ProxySHAP, a method for efficiently approximating Shapley and Banzhaf interaction indices in machine learning models. It combines tree-based proxy models with residual correction to achieve a better speed-accuracy trade-off than existing estimators. The paper contributes a polynomial-time generalization of interventional TreeSHAP and a formal analysis of residual adjustment, and reports state-of-the-art approximation quality in both small- and large-scale applications.
ProxySHAP estimates Shapley and Banzhaf interactions with lower error in both small- and large-budget regimes, making higher-order interaction analysis practical for models with thousands of features.
Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions force a trade-off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP that computes exact interaction indices for tree ensembles, bypassing the exponential dependence on tree depth of prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state of the art for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
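To make concrete what is being approximated, the sketch below computes Shapley and Banzhaf interaction indices by brute force on a three-player toy game, then checks the property that residual correction relies on: both indices are linear in the value function, so the index of a proxy plus the index of the residual equals the index of the original game. This is an illustration under stated assumptions, not the ProxySHAP algorithm. The names `interaction_index`, `game`, `proxy`, and `residual` are hypothetical, the proxy is hand-written rather than a fitted tree ensemble, and the residual term is evaluated exactly here, whereas the abstract describes estimating it from samples with MSR.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a tuple, smallest first."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def discrete_derivative(value_fn, S, T):
    """Sum over L subset of S of (-1)^(|S|-|L|) * v(T union L)."""
    total = 0.0
    for L in subsets(S):
        sign = (-1) ** (len(S) - len(L))
        total += sign * value_fn(frozenset(T) | frozenset(L))
    return total


def interaction_index(value_fn, n, S, kind="shapley"):
    """Exact interaction index of coalition S in an n-player game.

    Enumerates all 2^(n-|S|) subsets of the remaining players, so it is
    only feasible for tiny n. With |S| = 1 it reduces to the ordinary
    Shapley or Banzhaf value.
    """
    S = tuple(S)
    s = len(S)
    rest = [i for i in range(n) if i not in S]
    total = 0.0
    for T in subsets(rest):
        t = len(T)
        if kind == "shapley":
            weight = factorial(t) * factorial(n - t - s) / factorial(n - s + 1)
        elif kind == "banzhaf":
            weight = 1.0 / 2 ** (n - s)
        else:
            raise ValueError(f"unknown kind: {kind}")
        total += weight * discrete_derivative(value_fn, S, T)
    return total


# Toy game (hypothetical): main effects for players 0, 1, 2 and one
# pairwise interaction of strength 3.0 between players 0 and 1.
def game(T):
    return 2.0 * (0 in T) + 1.0 * (1 in T) + 3.0 * (0 in T and 1 in T) + 0.5 * (2 in T)


# Hand-written stand-in for a proxy model: it underestimates the
# interaction (2.5 instead of 3.0) and ignores player 2 entirely.
def proxy(T):
    return 2.0 * (0 in T) + 1.0 * (1 in T) + 2.5 * (0 in T and 1 in T)


def residual(T):
    return game(T) - proxy(T)


n = 3
pair = (0, 1)
exact = interaction_index(game, n, pair)
from_proxy = interaction_index(proxy, n, pair)
correction = interaction_index(residual, n, pair)
corrected = from_proxy + correction

print("exact Shapley interaction:", exact)
print("proxy only:", from_proxy)
print("proxy + residual:", corrected)
print("exact Banzhaf interaction:", interaction_index(game, n, pair, kind="banzhaf"))
```

In this sketch the proxy alone is biased, and adding the residual term recovers the exact index. In practice the residual term would have to be estimated from a limited number of model evaluations, which is where the variance analysis mentioned in the abstract becomes relevant.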