This paper introduces JI-ADF, a trimodal deep learning framework for skin lesion classification that integrates dermoscopic images, clinical photographs, and patient metadata. The architecture uses joint multimodal representation learning with modality-specific auxiliary supervision and an adaptive decision fusion mechanism to dynamically calibrate modality contributions. Evaluated on the MILK10k benchmark, JI-ADF demonstrates improved sensitivity and Dice score while maintaining high specificity and good calibration compared to unimodal approaches.
Fusing dermoscopic images, clinical photos, and patient metadata with adaptive per-sample weighting improves sensitivity and Dice score in skin lesion classification, even on imbalanced, real-world clinical datasets.
Skin lesion classification is essential for early dermatological diagnosis, yet many existing computer-aided systems rely primarily on dermoscopic images and underutilize the multimodal evidence routinely available in clinical practice. To address this gap, we propose JI-ADF, a trimodal deep learning framework that integrates dermoscopic images, clinical photographs, and structured patient metadata for clinically grounded skin lesion classification. The proposed architecture combines joint multimodal representation learning with modality-specific auxiliary supervision and an adaptive decision fusion mechanism that dynamically calibrates modality contributions on a per-sample basis. To enhance cross-modal reasoning while preserving modality-specific evidence, we further introduce a multimodal fusion attention (MMFA) module. We evaluate JI-ADF on the large-scale MILK10k benchmark, which reflects real-world clinical acquisition conditions and severe class imbalance. The proposed method demonstrates strong and well-balanced performance across lesion categories, improving sensitivity and Dice score while maintaining high specificity and good calibration. Extensive analyses, including modality ablation, calibration evaluation, and Grad-CAM visualization, further confirm the robustness and clinically meaningful behavior of the model. These results indicate that JI-ADF provides a reliable and practical foundation for multimodal skin lesion classification in real-world clinical settings.
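To make the idea of per-sample adaptive decision fusion concrete, the following is a minimal sketch and not the JI-ADF implementation. It assumes each modality head emits class logits for one sample and that some gating network (hypothetical here, stood in for by hand-picked scalars) emits one score per modality. The function name `adaptive_fusion`, the modality keys, and all numeric values are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(modality_logits, gate_scores):
    """Fuse per-modality class predictions for ONE sample.

    modality_logits: dict mapping modality name -> list of class logits.
    gate_scores:     dict mapping modality name -> scalar score, assumed to
                     come from a learned gating network (hypothetical).
    Returns (fused class probabilities, per-modality weights).
    """
    names = list(modality_logits)
    # Per-sample modality weights: softmax over the gate scores.
    weights = softmax([gate_scores[name] for name in names])
    # Per-modality class probabilities.
    probs = {name: softmax(modality_logits[name]) for name in names}
    n_classes = len(probs[names[0]])
    # Decision-level fusion: convex combination of the class distributions.
    fused = [
        sum(w * probs[name][c] for w, name in zip(weights, names))
        for c in range(n_classes)
    ]
    return fused, dict(zip(names, weights))


# Illustrative inputs only: three classes, three modalities.
logits = {
    "dermoscopy": [2.0, 0.0, -1.0],
    "clinical": [0.5, 0.5, 0.0],
    "metadata": [0.0, 0.0, 0.0],
}
gates = {"dermoscopy": 1.0, "clinical": 0.0, "metadata": -1.0}

fused, weights = adaptive_fusion(logits, gates)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because the weights are a softmax over per-sample scores, they are non-negative and sum to one, so the fused output remains a valid probability distribution. An uninformative modality (here the flat `metadata` logits) can be down-weighted for one sample without being discarded for all samples.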