ESA | Sheffield | Apr 28, 2026 | arXiv:2604.25568

Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings

Haolin Wang, Xianyuan Liu, Anna Jungbluth, Alexandra J. Ramadan, Robert D. J. Oliver, Haiping Lu

AI Summary

The paper introduces RealMat-BaG, a new benchmark dataset of experimental semiconductor bandgaps with aligned crystal structures, designed to evaluate the generalization capabilities of machine learning models under realistic conditions. Through rigorous evaluation of graph neural networks and classical ML baselines, the study reveals significant limitations in the ability of current models to generalize from DFT-computed data to experimental measurements. The framework also incorporates interpretability analysis at both elemental and structural levels, providing insights into model behavior.

Key Contribution

Current machine learning models for semiconductor bandgap prediction generalize poorly from DFT-computed data to experimental measurements, which points to a need for more robust and generalizable learning strategies.

Abstract

Accurate bandgap prediction is crucial for semiconductor applications, yet machine learning models trained on computational data often struggle to generalize to experimental bandgap measurements. Challenges related to data fidelity, domain generalization, and model interpretability remain insufficiently addressed in existing evaluation frameworks. To bridge this gap, we introduce RealMat-BaG, a benchmark for assessing model reliability under experimentally relevant conditions. We curate an open-access dataset of experimental bandgaps with aligned crystal structures and compare graph neural networks as well as classical machine learning baselines. Our framework evaluates performance across statistical and domain-based splits, examines transfer from DFT-computed to experimental bandgaps, and analyzes interpretability at both elemental-property and structural levels. Our results reveal the fundamental generalization limitations of current bandgap prediction models and establish a benchmark aligned with experimental measurements for developing more reliable learning strategies for materials discovery.
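The difference between a statistical split and a domain-based split can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records, the grouping by chemical family, the helper names (`random_split`, `domain_split`, `mean_baseline_mae`), and the mean-value baseline are hypothetical and are not taken from RealMat-BaG, whose actual split definitions and models are not given in this abstract.

```python
import random
from statistics import mean

# Toy records: (formula, chemical family, bandgap in eV).
# Values are rounded placeholders for illustration, not RealMat-BaG data.
RECORDS = [
    ("ZnO", "oxide", 3.3),
    ("TiO2", "oxide", 3.0),
    ("SnO2", "oxide", 3.6),
    ("CdS", "sulfide", 2.4),
    ("ZnS", "sulfide", 3.6),
    ("PbS", "sulfide", 0.4),
    ("GaN", "nitride", 3.4),
    ("AlN", "nitride", 6.0),
]


def random_split(records, test_fraction=0.25, seed=0):
    """Statistical split: shuffle, then hold out a fixed fraction."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def domain_split(records, held_out_family):
    """Domain-based split: hold out one whole chemical family."""
    train = [r for r in records if r[1] != held_out_family]
    test = [r for r in records if r[1] == held_out_family]
    return train, test


def mean_baseline_mae(train, test):
    """MAE of a trivial baseline that predicts the mean training bandgap."""
    prediction = mean(gap for _, _, gap in train)
    return mean(abs(gap - prediction) for _, _, gap in test)


if __name__ == "__main__":
    train, test = random_split(RECORDS)
    print("random split MAE (eV):", round(mean_baseline_mae(train, test), 3))

    train, test = domain_split(RECORDS, "nitride")
    print("held-out nitrides MAE (eV):", round(mean_baseline_mae(train, test), 3))
```

In a random split the test set shares chemical families with the training set, whereas the domain-based split forces the model to predict for a family it has never seen, which is the harder and more realistic setting the abstract refers to.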

Data Curation & Synthetic Data | Eval Frameworks & Benchmarks | Scientific Discovery & Drug Design

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
