This paper introduces a framework for on-manifold Shapley-based attribution using optimal generative flows, addressing the off-manifold artifacts common in post-hoc explainable AI. The authors prove a representation theorem that uniquely characterizes on-manifold Aumann-Shapley attributions as gradient line integrals along kinetic-energy-minimizing Wasserstein-2 geodesics. Experiments show that the approach achieves better manifold adherence and semantic alignment than existing baselines, and the theory provides provable stability guarantees.
Escape the curse of off-manifold Shapley values: this new method leverages optimal generative flows to produce attributions that actually respect the data manifold.
Shapley-based attribution is critical for post-hoc XAI but suffers from off-manifold artifacts due to heuristic baselines. While generative methods attempt to address this, they often introduce geometric inefficiency and discretization drift. We propose a formal theory of on-manifold Aumann-Shapley attributions driven by optimal generative flows. We prove a representation theorem establishing the gradient line integral as the unique functional satisfying efficiency and geometric axioms, notably reparameterization invariance. To resolve path ambiguity, we select the kinetic-energy-minimizing Wasserstein-2 geodesic transporting a prior to the data distribution. This yields a canonical attribution family that recovers classical Shapley for additive models and admits provable stability bounds against flow approximation errors. By reframing baseline selection as a variational problem, our method experimentally outperforms baselines, achieving strict manifold adherence via vanishing Flow Consistency Error and superior semantic alignment characterized by Structure-Aware Total Variation. Our code is available at https://github.com/cenweizhang/OTFlowSHAP.
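To make the gradient line integral concrete, the following is a minimal numerical sketch of path-based attribution, phi_i = ∫ ∂_i f(γ(t)) γ̇_i(t) dt, and of the properties the abstract names: efficiency, reparameterization invariance, and the additive-model case. It is not taken from the linked repository. The toy model `f`, the hand-written `curved_path`, and the helper `path_attributions` are all hypothetical and chosen for illustration; in particular, the path here is an arbitrary smooth curve, not a Wasserstein-2 geodesic learned by a generative flow.

```python
import numpy as np

def path_attributions(grad_f, path, n_steps=2000):
    """Midpoint-rule estimate of phi_i = integral of d_i f(gamma(t)) * dgamma_i(t).

    grad_f: maps a point (d,) to the gradient of f at that point (d,).
    path:   maps t in [0, 1] to a point (d,), with path(0) the baseline
            and path(1) the input being explained.
    """
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    points = np.array([path(t) for t in ts])       # (n_steps + 1, d)
    deltas = np.diff(points, axis=0)               # increments along the path
    mids = 0.5 * (points[1:] + points[:-1])        # segment midpoints
    grads = np.array([grad_f(m) for m in mids])    # gradient at each midpoint
    return (grads * deltas).sum(axis=0)            # per-feature attribution

# Hypothetical toy model with a feature interaction (assumption, not from the paper).
def f(x):
    return np.sin(x[0]) * x[1] + x[2] ** 2

def grad_f(x):
    return np.array([np.cos(x[0]) * x[1], np.sin(x[0]), 2.0 * x[2]])

x0 = np.zeros(3)                    # baseline point
x1 = np.array([1.0, 2.0, -0.5])     # point to explain

def curved_path(t):
    # Arbitrary smooth curve from x0 to x1; the sine bump vanishes at t=0 and t=1.
    return x0 + t * (x1 - x0) + np.sin(np.pi * t) * np.array([0.3, -0.2, 0.1])

def reparam_path(t):
    # Same curve traversed at a different speed.
    return curved_path(t ** 2)

phi = path_attributions(grad_f, curved_path)
phi_reparam = path_attributions(grad_f, reparam_path)

# Efficiency: attributions sum to f(x1) - f(x0).
efficiency_gap = abs(phi.sum() - (f(x1) - f(x0)))

# Reparameterization invariance: traversal speed does not change attributions.
reparam_gap = np.max(np.abs(phi - phi_reparam))

# Additive model f_add(x) = sum_i g_i(x_i): attributions reduce to
# g_i(x1_i) - g_i(x0_i) for any path between the same endpoints.
def grad_f_add(x):
    return np.array([np.cos(x[0]), 2.0 * x[1], np.exp(x[2])])

phi_add = path_attributions(grad_f_add, curved_path)
expected_add = np.array([
    np.sin(x1[0]) - np.sin(x0[0]),
    x1[1] ** 2 - x0[1] ** 2,
    np.exp(x1[2]) - np.exp(x0[2]),
])
additive_gap = np.max(np.abs(phi_add - expected_add))

print(efficiency_gap, reparam_gap, additive_gap)
```

All three gaps should be small, limited only by the discretization. For the non-additive `f`, the individual attributions still depend on which curve joins the two endpoints, which is the path ambiguity the abstract resolves by selecting the kinetic-energy-minimizing geodesic.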