UW · Cardiovascular Research Center · Eastern New Mexico Medical Center · McGill · MHI · University of California-San Francisco · Mar 5, 2025

Foundation models for generalizable electrocardiogram interpretation: comparison of supervised and self-supervised electrocardiogram foundation models

A. Nolin-Lapalme, Achille Sowa, Jacques Delfrate, O. Tastet, Denis Corbin, Merve Kulbay, Derman Ozdemir, Marie-Jeanne Noël, François-Christophe Marois-Blanchet, François Harvey, Surbhi Sharma, Minhaj Ansari, I. Chiu, Valentina Dsouza, Sam F. Friedman, Michaël Chassé, Brian J Potter, Jonathan Afilalo, P. Elias, Gilbert Jabbour, M. Bahani, Marie-Pierre Dubé, Patrick M. Boyle, Neal A. Chatterjee, Joshua P. Barrios, Geoffrey H. Tison, D. Ouyang, M. Maddah, Shaan Khurshid, Julia Cadrin-Tourigny, R. Tadros, J. Hussin, R. Avram

AI Summary

The authors developed and compared two open-source foundation models for ECG interpretation: DeepECG-SSL, a self-supervised model pretrained with contrastive learning and masked lead modeling, and DeepECG-SL, a supervised model. Both models were trained on over 1 million ECGs to predict 77 cardiac conditions and were evaluated on multiple datasets for ECG interpretation and digital biomarker tasks. DeepECG-SSL outperformed DeepECG-SL on digital biomarker tasks with limited labeled data, demonstrating the potential of self-supervised learning for ECG analysis, while both models showed minimal performance disparities across age and gender.

Key Contribution

Self-supervised pretraining outperforms supervised learning on ECG digital biomarker tasks when labeled data is scarce, while matching it on standard ECG interpretation, which supports more robust and generalizable AI-driven cardiac diagnostics.

Abstract

Background: The 12-lead electrocardiogram (ECG) remains a cornerstone of cardiac diagnostics, yet existing artificial intelligence (AI) solutions for automated interpretation have limited generalizability, are closed-source, and are primarily trained using supervised learning, limiting their adaptability across diverse clinical settings. To address these challenges, we developed, compared, and released as open source two foundational ECG models: DeepECG-SSL, a self-supervised learning model, and DeepECG-SL, a supervised learning model.
Methods: Both models were trained on over 1 million ECGs using a standardized preprocessing pipeline and automated free-text extraction from ECG reports to predict 77 cardiac conditions. DeepECG-SSL was pretrained using self-supervised contrastive learning and masked lead modeling. The models were evaluated on six multilingual private healthcare systems and four public datasets for ECG interpretation across 77 diagnostic categories, and for prediction of left ventricular ejection fraction (LVEF) ≤40% and ≤50%, long QT genotype identification, and 5-year incident atrial fibrillation (AF) prediction. Fairness analyses assessed disparities in performance across age and gender groups.
Results: DeepECG-SSL achieved AUROCs of 0.989 (internal test), 0.981 (public datasets), and 0.983 (private datasets), while DeepECG-SL demonstrated AUROCs of 0.992, 0.980, and 0.983, respectively. For digital biomarker tasks with limited labeled data, DeepECG-SSL outperformed DeepECG-SL in predicting 5-year atrial fibrillation risk (N=132,050; AUROC 0.742 vs. 0.720; Δ=0.022; P<0.001), identifying reduced left ventricular ejection fraction ≤40% (N=25,252; 0.928 vs. 0.900; Δ=0.028; P<0.001), and classifying long QT syndrome subtypes (N=127; 0.931 vs. 0.853; Δ=0.078; P=0.026). Fairness analyses revealed minimal disparities (true positive rate and false positive rate differences <0.010) across age and gender groups.
Conclusion: This study establishes self-supervised learning as a promising paradigm for ECG analysis, particularly in settings with limited annotated data, enhancing accessibility, generalizability, and fairness in AI-driven cardiac diagnostics. By releasing model weights, preprocessing tools, and validation code, we aim to support robust, low-data-friendly AI diagnostics across diverse clinical environments.
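The fairness result above is reported as differences in true positive rate (TPR) and false positive rate (FPR) across demographic groups. The abstract does not say how those differences were computed, so the following is only a minimal sketch of one common approach, assuming binary predictions at a fixed threshold and taking the gap as the maximum minus the minimum rate across groups. The function names `rates_by_group` and `max_gap` and the toy data are hypothetical and are not taken from the released code.

```python
from collections import defaultdict


def rates_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)} for binary labels and binary predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "fp" if p == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # all truly positive cases in this group
        neg = c["fp"] + c["tn"]  # all truly negative cases in this group
        tpr = c["tp"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        rates[g] = (tpr, fpr)
    return rates


def max_gap(rates):
    """Return (TPR gap, FPR gap) as max minus min across groups (assumed definition)."""
    tprs = [r[0] for r in rates.values()]
    fprs = [r[1] for r in rates.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

rates = rates_by_group(y_true, y_pred, groups)
tpr_gap, fpr_gap = max_gap(rates)
print(rates, tpr_gap, fpr_gap)
```

In this toy example the gaps are large by construction; the study reports gaps below 0.010 for both rates across age and gender groups.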

Open-Source Models & Weights · Scientific Discovery & Drug Design · Training Efficiency & Optimization

Citation Metrics

Citations: 2

Influential citations: 0

References: 19

Year: 2025

Venue: medRxiv
