This paper introduces Generative Cross-Entropy (GCE), a novel loss function derived from a generative perspective that maximizes $p(x|y)$ to improve both accuracy and calibration in deep neural networks. GCE effectively regularizes cross-entropy with a class-level confidence term, addressing the overconfidence issue common in DNNs trained with negative log-likelihood. Experiments across multiple datasets, including long-tailed scenarios, demonstrate that GCE achieves superior accuracy and calibration compared to standard cross-entropy, and, when combined with adaptive temperature scaling, matches the calibration performance of focal loss variants without sacrificing accuracy.
DNNs can be both more accurate *and* better calibrated: Generative Cross-Entropy (GCE) offers a way to escape the accuracy-calibration trade-off.
Reliable classification requires not only high predictive accuracy but also well-calibrated confidence estimates. Yet modern deep neural networks (DNNs) are often overconfident, primarily due to overfitting on the negative log-likelihood (NLL). While focal loss variants alleviate this issue, they typically reduce accuracy, revealing a persistent trade-off between calibration and predictive performance. Motivated by the complementary strengths of generative and discriminative classifiers, we propose Generative Cross-Entropy (GCE), which maximizes $p(x|y)$ and is equivalent to cross-entropy augmented with a class-level confidence regularizer. Under mild conditions, GCE is strictly proper. Across CIFAR-10/100, Tiny-ImageNet, and a medical imaging benchmark, GCE improves both accuracy and calibration over cross-entropy, especially in the long-tailed scenario. Combined with adaptive piecewise temperature scaling (ATS), GCE attains calibration competitive with focal-loss variants without sacrificing accuracy.
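To make the "cross-entropy plus a class-level confidence regularizer" idea concrete, here is a minimal NumPy sketch. The abstract does not give GCE's exact functional form, so the penalty below (a squared mean per-class confidence term) and the weight `lam` are illustrative assumptions, not the paper's loss.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def regularized_ce(logits, labels, lam=0.1):
    """Cross-entropy plus an illustrative class-level confidence penalty.

    NOTE: not the paper's exact GCE loss -- the abstract only states that
    GCE is equivalent to cross-entropy plus a class-level confidence
    regularizer. The penalty form and `lam` here are assumptions.
    """
    probs = softmax(logits)
    n, num_classes = logits.shape
    # Standard cross-entropy (mean NLL of the true class).
    nll = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    # Class-level confidence: mean predicted probability of each class
    # on its own examples; penalizing it discourages overconfidence.
    class_conf = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            class_conf[c] = probs[mask, c].mean()
    penalty = np.square(class_conf).sum()
    return nll + lam * penalty
```

With `lam=0` this reduces to plain cross-entropy; increasing `lam` trades a small amount of fit for less confident (better-calibrated) per-class predictions, which is the qualitative behavior the abstract attributes to GCE.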