Apr 23, 2026arXiv:2604.21395

Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair

AI Summary

This paper proves that empirical risk minimization (ERM) enforces a geometric constraint where learned representations must retain Jacobian sensitivity to label-correlated but nuisance features, a phenomenon termed the "geometric blind spot." This blind spot, a mathematical consequence of supervised learning, explains vulnerabilities like adversarial examples, texture bias, and robustness-accuracy tradeoffs. The authors introduce the Trajectory Deviation Index (TDI) to measure this blind spot and demonstrate its presence in various models, showing it can be repaired with a specific perturbation law.

Key Contribution

Supervised learning is fundamentally flawed: models *must* retain sensitivity to irrelevant features, opening the door to adversarial attacks and other vulnerabilities.

Abstract

We prove that empirical risk minimisation (ERM) imposes a necessary geometric constraint on learned representations: any encoder that minimises supervised loss must retain non-zero Jacobian sensitivity in directions that are label-correlated in training data but nuisance at test time. This is not a contingent failure of current methods; it is a mathematical consequence of the supervised objective itself. We call this the geometric blind spot of supervised learning (Theorem 1), and show it holds across proper scoring rules, architectures, and dataset sizes. This single theorem unifies four lines of prior empirical work that were previously treated separately: non-robust predictive features, texture bias, corruption fragility, and the robustness-accuracy tradeoff. In this framing, adversarial vulnerability is one consequence of a broader structural fact about supervised learning geometry. We introduce Trajectory Deviation Index (TDI), a diagnostic that measures the theorem's bounded quantity directly, and show why common alternatives miss the key failure mode. PGD adversarial training reaches Jacobian Frobenius 2.91 yet has the worst clean-input geometry (TDI 1.336), while PMH achieves TDI 0.904. TDI is the only metric that detects this dissociation because it measures isotropic path-length distortion -- the exact quantity Theorem 1 bounds. Across seven vision tasks, BERT/SST-2, and ImageNet ViT-B/16 backbones used by CLIP, DINO, and SAM, the blind spot is measurable and repairable. It is present at foundation-model scale, worsens monotonically across language-model sizes (blind-spot ratio 0.860 to 0.765 to 0.742 from 66M to 340M), and is amplified by task-specific ERM fine-tuning (+54%), while PMH repairs it by 11x with one additional training term whose Gaussian form Proposition 5 proves is the unique perturbation law that uniformly penalises the encoder Jacobian.

Constitutional AI & AI Ethics Interpretability & Mechanistic Interp Scalable Oversight & Alignment Theory

Citation Metrics

Citations0

Influential citations0

References23

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair

Related Papers