Feb 23, 2026

arXiv:2602.19799

Path-conditioned training: a principled way to rescale ReLU neural networks

Arthur Lebeurrier, Titouan Vayer, Rémi Gribonval

AI Summary

This paper introduces a path-conditioned training strategy for ReLU neural networks that exploits rescaling symmetries: the parameters are rescaled so that a kernel in the path-lifting space aligns with a chosen reference. The authors derive an efficient algorithm that minimizes a geometrically motivated rescaling criterion. For randomly initialized networks, they analyze how the architecture and the initialization scale jointly affect the output of the method, and numerical experiments illustrate its potential to speed up training.

Key Contribution

A geometrically motivated path-conditioning criterion for rescaling ReLU network parameters, which aligns a kernel in the path-lifting space with a chosen reference and shows potential to speed up training in numerical experiments.

Abstract

Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled sets of weights implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion to rescale neural network parameters, whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
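The rescaling symmetry mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it only shows, for a hypothetical one-hidden-layer ReLU network, that scaling a neuron's incoming weights and bias by a positive factor and its outgoing weight by the inverse factor leaves the function and the per-path weight products unchanged, while the gradient with respect to the outgoing weights changes. All names (`forward`, `rescale`, `path_products`, `grad_v`) and the chosen sizes and factors are assumptions made for illustration.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def pre_activation(row, bias, x):
    # Affine part of one hidden neuron: row . x + bias
    return sum(w * xi for w, xi in zip(row, x)) + bias


def forward(W, b, v, x):
    # f(x) = sum_j v[j] * relu(W[j] . x + b[j])
    return sum(vj * relu(pre_activation(row, bj, x))
               for row, bj, vj in zip(W, b, v))


def rescale(W, b, v, lambdas):
    # Neuron j: incoming weights and bias times lambdas[j] > 0,
    # outgoing weight divided by lambdas[j].
    W2 = [[lam * w for w in row] for row, lam in zip(W, lambdas)]
    b2 = [lam * bj for bj, lam in zip(b, lambdas)]
    v2 = [vj / lam for vj, lam in zip(v, lambdas)]
    return W2, b2, v2


def path_products(W, v):
    # Product of the two weights along each input -> hidden -> output path.
    return [[vj * w for w in row] for row, vj in zip(W, v)]


def grad_v(W, b, x):
    # d f(x) / d v[j] is the activation of hidden neuron j.
    return [relu(pre_activation(row, bj, x)) for row, bj in zip(W, b)]


rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b = [rng.uniform(-1, 1) for _ in range(4)]
v = [rng.uniform(-1, 1) for _ in range(4)]
x = [0.5, -1.0, 2.0]
lambdas = [0.1, 1.0, 10.0, 100.0]  # arbitrary positive factors

W2, b2, v2 = rescale(W, b, v, lambdas)

out_before = forward(W, b, v, x)
out_after = forward(W2, b2, v2, x)
print("output before:", out_before, "after:", out_after)
print("grad wrt v before:", grad_v(W, b, x))
print("grad wrt v after: ", grad_v(W2, b2, x))
```

The two outputs agree up to floating-point error because ReLU is positively homogeneous, whereas the gradient entry for neuron j is multiplied by `lambdas[j]`. This is one simple way to see why equivalent parameterizations can train differently under gradient descent.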

Architecture Design (Transformers, SSMs, MoE)

Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
