Feb 26, 2026

arXiv:2602.23219

Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime

Hiroki Naganuma, Taiji Suzuki, Rio Yokota, Masahiro Nomura, Kohta Ishikawa, Ikuro Sato

AI Summary

This paper investigates Takeuchi's Information Criterion (TIC) as a generalization measure for deep neural networks (DNNs), particularly focusing on its applicability near the Neural Tangent Kernel (NTK) regime. Through theoretical analysis and experiments on over 5,000 DNN models with various architectures and datasets, the authors demonstrate a strong correlation between estimated TIC values and the generalization gap when DNNs operate close to the NTK regime. The study also reveals that this correlation diminishes outside the NTK regime and that TIC can effectively prune trials during hyperparameter optimization.

Key Contribution

Estimates of Takeuchi's Information Criterion (TIC) correlate well with DNN generalization gaps, but only when models operate near the Neural Tangent Kernel (NTK) regime; outside that regime the correlation disappears.

Abstract

Generalization measures have been studied extensively in the machine learning community to better characterize generalization gaps. However, establishing a reliable generalization measure for statistically singular models such as deep neural networks (DNNs) is difficult due to their complex nature. This study focuses on Takeuchi's information criterion (TIC) to investigate the conditions under which this classical measure can effectively explain the generalization gaps of DNNs. Importantly, the developed theory indicates the applicability of TIC near the neural tangent kernel (NTK) regime. In a series of experiments, we trained more than 5,000 DNN models with 12 architectures, including large models (e.g., VGG-16), on four datasets, and estimated the corresponding TIC values to examine the relationship between the generalization gap and the TIC estimates. We applied several TIC approximation methods with feasible computational costs and assessed the accuracy trade-off. Our experimental results indicate that the estimated TIC values correlate well with the generalization gap under conditions close to the NTK regime. However, we show both theoretically and empirically that outside the NTK regime such correlation disappears. Finally, we demonstrate that TIC provides better trial pruning ability than existing methods for hyperparameter optimization.
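As a rough illustration of the quantity discussed above, the classical TIC penalty is the trace tr(H^{-1} C), where H is the Hessian of the average training loss and C is the covariance of per-sample gradients, both evaluated at the fitted parameters; dividing by the sample size n gives an estimate of the generalization gap. The sketch below assumes a deliberately tiny model (a Gaussian mean with unit model variance, so H and C are scalars) and is not the estimator or approximation used in the paper; the function name `tic_gap_estimate` is hypothetical.

```python
import random


def tic_gap_estimate(xs):
    """Estimate the generalization gap tr(H^{-1} C) / n for a toy model.

    Assumed model: per-sample loss 0.5 * (x - mu)^2, i.e. a Gaussian mean
    model with unit variance. With one parameter, H and C are scalars.
    """
    n = len(xs)
    mu_hat = sum(xs) / n                  # minimizer of the average training loss
    grads = [mu_hat - x for x in xs]      # per-sample gradients at mu_hat
    c = sum(g * g for g in grads) / n     # gradient covariance C
    h = 1.0                               # Hessian of the loss (constant here)
    return (c / h) / n


# Data with true variance 4, so the unit-variance model is misspecified.
random.seed(0)
xs = [random.gauss(0.0, 2.0) for _ in range(1000)]
gap = tic_gap_estimate(xs)
print(f"estimated generalization gap: {gap:.5f}")
```

For this toy model the expected gap between test and training loss is the true variance divided by n, so the estimate should land near 4 / 1000. A parameter-count penalty in the style of AIC would instead give 1 / n here, which is the kind of discrepancy under misspecification that TIC is meant to capture.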

Architecture Design (Transformers, SSMs, MoE)

Training Efficiency & Optimization

Citation Metrics

Citations: 1

Influential citations: 0

References: 72

Year: 2026

Venue: N/A
