Mar 3, 2026 | arXiv:2603.02622

Implicit Bias in Deep Linear Discriminant Analysis

AI Summary

This paper provides a theoretical analysis of the implicit regularization induced by Deep Linear Discriminant Analysis (Deep LDA), a scale-invariant objective that minimizes intraclass variance and maximizes interclass distance. The analysis focuses on the gradient flow of the Deep LDA loss on an L-layer diagonal linear network. The key result demonstrates that under balanced initialization, the network transforms additive gradient updates into multiplicative weight updates, leading to automatic conservation of the (2/L) quasi-norm.

Key Contribution

Deep LDA implicitly conserves a (2/L) quasi-norm in deep linear networks, revealing a novel form of implicit regularization in discriminative metric learning.
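The conservation claim can be sketched in a few lines. The notation below is an assumption made for illustration and is not taken from the paper: under balanced initialization all L layers share the same positive diagonal entries u_i, so the effective weight is w_i = u_i^L; ℓ denotes a generic scale-invariant loss, ℓ(cw) = ℓ(w) for c > 0.

```latex
% Gradient flow on each layer, written for the shared balanced weights u_i > 0:
\dot{u}_i = -\,u_i^{L-1}\,\frac{\partial \ell}{\partial w_i}
\quad\Longrightarrow\quad
\dot{w}_i = L\,u_i^{L-1}\,\dot{u}_i = -\,L\,w_i^{\,2-2/L}\,\frac{\partial \ell}{\partial w_i}.

% Scale invariance (differentiate \ell(cw) = \ell(w) in c at c = 1):
\langle w, \nabla \ell(w) \rangle = 0.

% Time derivative of the (2/L) quasi-norm:
\frac{d}{dt}\sum_i w_i^{2/L}
= \frac{2}{L}\sum_i w_i^{\,2/L-1}\,\dot{w}_i
= -2\sum_i w_i\,\frac{\partial \ell}{\partial w_i}
= -2\,\langle w, \nabla \ell(w) \rangle
= 0.
```

The factor w_i^{2-2/L} is what makes the update multiplicative in w_i: coordinates near zero move slowly, and the orthogonality of w and ∇ℓ then fixes the sum of w_i^{2/L} along the flow.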

Abstract

While the implicit bias (or implicit regularization) of standard loss functions has been studied, the optimization geometry induced by discriminative metric-learning objectives remains largely unexplored. To the best of our knowledge, this paper presents an initial theoretical analysis of the implicit regularization induced by Deep LDA, a scale-invariant objective designed to minimize intraclass variance and maximize interclass distance. By analyzing the gradient flow of the loss on an L-layer diagonal linear network, we prove that under balanced initialization, the network architecture transforms standard additive gradient updates into multiplicative weight updates, which demonstrates an automatic conservation of the (2/L) quasi-norm.
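A small numerical check may help make the conservation statement concrete. The sketch below is not the paper's experiment: it assumes a two-class Fisher-ratio loss (within-class scatter over between-class scatter) as a stand-in for the Deep LDA objective, and the values of `s`, `delta`, `u0`, the depth, and the step size are all hypothetical. Gradient flow is approximated with small explicit Euler steps, so the quasi-norm is only conserved up to discretization error.

```python
# Sketch: Euler-discretized gradient flow on an L-layer diagonal linear network.
# ASSUMPTION: the loss below is a generic scale-invariant Fisher ratio,
# used here as a stand-in for the Deep LDA loss; all constants are hypothetical.

DEPTH = 3                      # number of layers L
s = [1.0, 2.0, 0.5]            # assumed diagonal within-class scatter
delta = [1.0, -0.5, 2.0]       # assumed difference between the two class means
u0 = [1.0, 0.8, 0.6]           # shared initial diagonal of every layer (balanced)


def effective_weights(layers):
    """Effective weight w_i = product over layers of u_{l,i}."""
    w = [1.0] * len(layers[0])
    for layer in layers:
        w = [wi * ui for wi, ui in zip(w, layer)]
    return w


def loss_and_grad(w):
    """Fisher ratio (w^T S_w w) / (delta . w)^2 and its gradient in w."""
    a = sum(si * wi * wi for si, wi in zip(s, w))
    proj = sum(di * wi for di, wi in zip(delta, w))
    b = proj * proj
    grad = [2 * si * wi / b - 2 * a * di * proj / (b * b)
            for si, wi, di in zip(s, w, delta)]
    return a / b, grad


def quasi_norm(w):
    """The (2/L) quasi-norm: sum_i |w_i|^(2/L)."""
    return sum(abs(wi) ** (2.0 / DEPTH) for wi in w)


def simulate(steps=20000, eta=1e-4):
    layers = [list(u0) for _ in range(DEPTH)]
    w = effective_weights(layers)
    loss_start, _ = loss_and_grad(w)
    q_start = quasi_norm(w)
    for _ in range(steps):
        w = effective_weights(layers)
        _, g = loss_and_grad(w)
        new_layers = []
        for l in range(DEPTH):
            updated = []
            for i in range(len(w)):
                # Chain rule: d loss / d u_{l,i} = g_i * product of the other layers.
                others = 1.0
                for k in range(DEPTH):
                    if k != l:
                        others *= layers[k][i]
                updated.append(layers[l][i] - eta * g[i] * others)
            new_layers.append(updated)
        layers = new_layers
    w = effective_weights(layers)
    loss_end, _ = loss_and_grad(w)
    return loss_start, loss_end, q_start, quasi_norm(w)


if __name__ == "__main__":
    l0, l1, q0, q1 = simulate()
    print("loss:", l0, "->", l1)
    print("(2/L) quasi-norm:", q0, "->", q1)
```

Shrinking `eta` (with proportionally more steps) should reduce the residual drift in the quasi-norm, since the drift comes from the Euler discretization rather than from the flow itself.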

Architecture Design (Transformers, SSMs, MoE) | Constitutional AI & AI Ethics | Training Efficiency & Optimization

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
