This paper provides an axiomatic characterization of feature attribution for multi-output predictors using the Shapley framework, demonstrating that any attribution rule satisfying efficiency, symmetry, dummy player, and additivity must decompose component-wise across outputs. The work addresses the lack of theoretical justification for computing SHAP explanations independently for each output coordinate in multi-output models. The key result is a rigidity theorem proving that joint-output attribution rules must relax at least one of the classical Shapley axioms to avoid component-wise decomposition.
Shapley-based feature attribution for multi-output models *must* decompose component-wise across outputs to satisfy standard fairness axioms, revealing a fundamental constraint on joint-output explanations.
In this article, we provide an axiomatic characterization of feature attribution for multi-output predictors within the Shapley framework. While SHAP explanations are routinely computed independently for each output coordinate, the theoretical necessity of this practice has remained unclear. By extending the classical Shapley axioms to vector-valued cooperative games, we establish a rigidity theorem showing that any attribution rule satisfying efficiency, symmetry, dummy player, and additivity must decompose component-wise across outputs. Consequently, any attribution rule that treats the outputs jointly, rather than coordinate by coordinate, must relax at least one of the classical Shapley axioms. This result identifies a previously unformalized structural constraint in Shapley-based interpretability, clarifying the precise scope of fairness-consistent explanations in multi-output learning. Numerical experiments on a biomedical benchmark illustrate that multi-output models can yield computational savings in training and deployment, while producing SHAP explanations that remain fully consistent with the component-wise structure imposed by the Shapley axioms.
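The component-wise rule singled out by the theorem can be illustrated with a minimal sketch. It assumes a hypothetical three-feature, two-output toy game; the names `shapley_scalar`, `shapley_componentwise`, and `toy_game` are illustrative and do not come from the article. The sketch applies the classical scalar Shapley formula to each output coordinate separately and checks that efficiency holds in every coordinate. It illustrates the rule only and is not a proof of the rigidity theorem.

```python
from itertools import combinations
from math import factorial


def shapley_scalar(n, v):
    """Exact Shapley values of a scalar game v: frozenset -> float on players 0..n-1."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of a coalition S that does not contain i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                S = frozenset(subset)
                phi[i] += weight * (v(S | {i}) - v(S))
    return phi


def shapley_componentwise(n, v_vec, dim):
    """Apply the scalar Shapley value to each output coordinate separately."""
    per_output = [shapley_scalar(n, lambda S, d=d: v_vec(S)[d]) for d in range(dim)]
    # Transpose so that result[i] is the attribution vector of feature i.
    return [tuple(per_output[d][i] for d in range(dim)) for i in range(n)]


def toy_game(S):
    """Hypothetical 2-output game on 3 features (an assumption, for illustration only)."""
    contributions = {0: 1.0, 1: 2.0, 2: 3.0}
    additive = sum(contributions[i] for i in S)   # output 0: purely additive
    interaction = 1.0 if {0, 1} <= S else 0.0     # output 1: needs features 0 and 1
    return (additive, interaction)


phi = shapley_componentwise(3, toy_game, dim=2)
full, empty = toy_game(frozenset({0, 1, 2})), toy_game(frozenset())
for d in range(2):
    # Efficiency holds separately in every output coordinate.
    assert abs(sum(p[d] for p in phi) - (full[d] - empty[d])) < 1e-9
```

In this toy game, feature 2 has no effect on the second output and receives zero attribution there (dummy player), while features 0 and 1 play interchangeable roles in that output and split its value equally (symmetry).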