Stuttgart, Feb 17, 2026, arXiv:2602.15747

How to Train a Shallow Ensemble

Moritz Schäfer, Matthias Kellner, Johannes Kästner, Michele Ceriotti

AI Summary

The paper investigates training strategies for shallow ensembles of machine learning interatomic potentials, aiming to improve uncertainty quantification while limiting computational cost. The authors demonstrate that directly optimizing a negative log-likelihood (NLL) loss, especially one that includes forces, improves calibration compared to ensembles of randomly initialized models or a last-layer Laplace approximation. The key result is an efficient two-step protocol: first train with a probabilistic energy loss or sample from the Laplace posterior, then fine-tune the full model with a force-based NLL objective. This achieves calibration comparable to training from scratch while reducing training time by up to 96%.

Key Contribution

Fine-tuning shallow ensembles reduces the training time of calibrated machine learning interatomic potentials by up to 96%, with negligible loss in calibration quality.

Abstract

Shallow ensembles provide a convenient strategy for uncertainty quantification in machine learning interatomic potentials, one that is computationally efficient because the different ensemble members share a large part of the model weights. In this work, we systematically investigate training strategies for shallow ensembles to balance calibration performance with computational cost. We first demonstrate that explicit optimization of a negative log-likelihood (NLL) loss improves calibration relative to approaches based on ensembles of randomly initialized models or on a last-layer Laplace approximation. However, models trained solely on energy objectives yield miscalibrated force estimates. We show that explicitly modeling force uncertainties via an NLL objective is essential for reliable calibration, though it typically incurs a significant computational overhead. To address this, we validate an efficient protocol: full-model fine-tuning of a shallow ensemble originally trained with a probabilistic energy loss, or one sampled from the Laplace posterior. This approach results in a negligible reduction in calibration quality compared to training from scratch, while reducing training time by up to 96%. We evaluate this protocol across a diverse range of materials, including amorphous carbon, ionic liquids (BMIM), liquid water (H$_2$O), barium titanate (BaTiO$_3$), and a model tetrapeptide (Ac-Ala$_3$-NHMe), establishing practical guidelines for reliable uncertainty quantification in atomistic machine learning.
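To make the NLL objective mentioned in the abstract concrete, the following is a minimal sketch of how an ensemble's predictions can be scored with a Gaussian negative log-likelihood. The abstract does not state the exact loss, so the Gaussian form, the use of the ensemble mean and sample variance as predictive moments, and all names here (`ensemble_gaussian_nll`, `member_preds`, `eps`) are assumptions for illustration, not the authors' implementation. The same expression could be applied per force component; in actual training it would be differentiated through the model with an autodiff framework rather than merely evaluated.

```python
import numpy as np

def ensemble_gaussian_nll(member_preds, targets, eps=1e-8):
    """Mean Gaussian NLL of targets under the ensemble mean and variance.

    member_preds: shape (n_members, n_samples), one row per ensemble member
    targets:      shape (n_samples,), reference values (e.g. energies)
    eps:          small constant keeping the variance strictly positive
    """
    mean = member_preds.mean(axis=0)
    # Unbiased sample variance across members, used as predictive variance.
    var = member_preds.var(axis=0, ddof=1) + eps
    nll = 0.5 * (np.log(2.0 * np.pi * var) + (targets - mean) ** 2 / var)
    return nll.mean()

# Synthetic illustration (hypothetical data, not taken from the paper).
rng = np.random.default_rng(0)
n_members, n_samples, scale = 16, 200, 0.3
targets = rng.normal(size=n_samples)
shared_error = rng.normal(scale=scale, size=n_samples)

# Ensemble whose spread roughly matches the size of its actual error.
spread = rng.normal(size=(n_members, n_samples))
calibrated = targets + shared_error + scale * spread
# Same error, but a spread ten times too small (overconfident).
overconfident = targets + shared_error + 0.1 * scale * spread

nll_calibrated = ensemble_gaussian_nll(calibrated, targets)
nll_overconfident = ensemble_gaussian_nll(overconfident, targets)
print(f"calibrated NLL:    {nll_calibrated:.3f}")
print(f"overconfident NLL: {nll_overconfident:.3f}")
```

In this toy setting the overconfident ensemble receives a much larger NLL than the calibrated one, which is the property that makes the NLL usable as a training signal for calibration.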


