This paper studies calibration as a training-time phenomenon in deep neural networks, on small vision tasks, rather than as a post-hoc adjustment. Empirically, the authors find that Expected Calibration Error (ECE) closely tracks curvature-based sharpness throughout optimization. Mathematically, they show that ECE and Gauss-Newton curvature are both controlled, up to problem-specific constants, by the same margin-dependent exponential tail functional along the training trajectory. Guided by this, they introduce a margin-aware training objective that targets robust-margin tails and local smoothness, improving out-of-sample calibration across optimizers without sacrificing accuracy.
Calibration can be improved during training, rather than after it, by targeting margin tails and local curvature, yielding better confidence estimates without sacrificing accuracy.
Modern neural networks can achieve high accuracy while remaining poorly calibrated, producing confidence estimates that do not match empirical correctness. Yet calibration is often treated as a post-hoc attribute. We take a different perspective: we study calibration as a training-time phenomenon on small vision tasks, and ask whether calibrated solutions can be obtained reliably by intervening on the training procedure. We identify a tight coupling between calibration, curvature, and margins during training of deep networks under multiple gradient-based methods. Empirically, Expected Calibration Error (ECE) closely tracks curvature-based sharpness throughout optimization. Mathematically, we show that both ECE and Gauss-Newton curvature are controlled, up to problem-specific constants, by the same margin-dependent exponential tail functional along the trajectory. Guided by this mechanism, we introduce a margin-aware training objective that explicitly targets robust-margin tails and local smoothness, yielding improved out-of-sample calibration across optimizers without sacrificing accuracy.
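For readers unfamiliar with the metric, the following is a minimal sketch of the commonly used binned ECE estimator, not the authors' evaluation code. The function name `expected_calibration_error`, the default of 10 equal-width bins, and the half-open bin convention are assumptions made for illustration. The abstract does not specify which ECE variant or bin count the paper uses.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Binned ECE: the sample-weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    n_bins: number of equal-width confidence bins (an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0] * n_bins      # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, y in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += y
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        avg_confidence = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_confidence)
    return ece
```

As a usage example, four predictions made at confidence 0.9, of which three are correct, fall into a single bin with accuracy 0.75, so the estimate is |0.75 - 0.9| = 0.15. A model whose confidence matches its accuracy in every bin scores 0.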