ECG-Scan, a self-supervised framework, learns clinically relevant representations from ECG images by aligning image features with corresponding signal-text modalities via contrastive learning. The framework further incorporates physiological domain knowledge through soft-lead constraints, regularizing reconstruction and enforcing inter-lead consistency. Benchmarking shows ECG-Scan outperforms existing image-based methods and narrows the performance gap with signal-based ECG analysis across multiple datasets and downstream tasks.
Unlock vast troves of legacy ECG image data for automated cardiovascular diagnostics with a self-supervised framework that rivals signal-based analysis.
Electrocardiograms (ECGs) are among the most widely used diagnostic tools for cardiovascular diseases, and a large amount of ECG data worldwide exists only in image form. However, most existing automated ECG analysis methods rely on access to raw signal recordings, limiting their applicability in real-world and resource-constrained settings. In this paper, we present ECG-Scan, a self-supervised framework for learning clinically generalizable representations from ECG images through dual physiology-aware alignments: 1) Our approach optimizes image representation learning using multimodal contrastive alignment between image and gold-standard signal-text modalities. 2) We further integrate domain knowledge via soft-lead constraints, regularizing the reconstruction process and improving inter-lead consistency. Extensive benchmarking across multiple datasets and downstream tasks demonstrates that our image-based model achieves superior performance compared to existing image baselines and notably narrows the gap between ECG image and signal analysis. These results highlight the potential of self-supervised image modeling to unlock large-scale legacy ECG data and broaden access to automated cardiovascular diagnostics.
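The two alignments described above can be illustrated with a minimal sketch. This is not the authors' implementation: the symmetric InfoNCE-style contrastive loss is a standard choice for multimodal alignment, and the inter-lead penalty shown here is a hypothetical form based on Einthoven's law (lead II = lead I + lead III); the function names, temperature value, and exact penalty are assumptions.

```python
import numpy as np

def log_softmax(x, axis):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_alignment_loss(img_emb, sigtext_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss: pull each ECG-image embedding toward
    its paired signal-text embedding, push it away from other pairs.
    (A common multimodal contrastive objective; hyperparameters assumed.)"""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    sig = sigtext_emb / np.linalg.norm(sigtext_emb, axis=1, keepdims=True)
    logits = img @ sig.T / temperature          # (N, N) scaled cosine similarities
    diag = np.arange(len(img))
    loss_i2s = -log_softmax(logits, axis=1)[diag, diag].mean()  # image -> signal-text
    loss_s2i = -log_softmax(logits, axis=0)[diag, diag].mean()  # signal-text -> image
    return 0.5 * (loss_i2s + loss_s2i)

def soft_lead_penalty(lead_I, lead_II, lead_III):
    """Hypothetical soft inter-lead consistency term: penalize reconstructed
    limb leads that violate Einthoven's law (II = I + III)."""
    return np.mean((lead_II - (lead_I + lead_III)) ** 2)
```

In a training loop of this shape, the contrastive term would shape the image encoder while the soft-lead term regularizes the signal reconstruction head; a squared penalty keeps the constraint "soft" rather than enforcing the physiological relation exactly.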