This paper addresses the scarcity of surgical hyperspectral imaging (HSI) datasets by proposing a novel approach to cross-modality knowledge transfer using generative modeling. The authors employ a latent diffusion model (LDM) to synthesize realistic HSI images from semantic segmentation masks derived from other modalities, enabling the transfer of geometric information. Experiments on a dataset of more than 13,000 HSI images demonstrate that LDMs can generate realistic high-resolution HSI images, even with limited training data or out-of-distribution annotations, yielding a substantial performance boost (up to 35% Dice) in semantic segmentation.
Overcome the surgical HSI data bottleneck: a latent diffusion model generates realistic hyperspectral images from segmentation masks, even masks derived from other modalities, boosting segmentation performance by up to 35% Dice.
Hyperspectral imaging (HSI) is a promising intraoperative imaging modality, with potential applications ranging from tissue classification and discrimination to perfusion monitoring and cancer detection. However, surgical HSI datasets are scarce, hindering the development of robust data-driven algorithms. The purpose of this work was to address this critical bottleneck with a novel approach to knowledge transfer across modalities. We propose the use of generative modeling to leverage imaging data across optical imaging modalities. The core of the method is a latent diffusion model (LDM) capable of converting a semantic segmentation mask obtained from any modality into a realistic hyperspectral image, such that geometry information can be learned across modalities. The value of the approach was assessed both qualitatively and quantitatively using surgical scene segmentation as a downstream task. Our study with more than 13,000 hyperspectral images, partially annotated with a total of 37 tissue and object classes, suggests that LDMs are well-suited for the synthesis of realistic high-resolution hyperspectral images even when trained on few samples or applied to annotations from different modalities and geometric out-of-distribution annotations. Using our approach for generative augmentation yielded a performance boost of up to 35% in the Dice similarity coefficient for the task of semantic hyperspectral image segmentation. As our method is capable of augmenting HSI datasets in a manner agnostic to the modality of the leveraged data, it could serve as a blueprint for addressing the data bottleneck encountered for novel imaging modalities.
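The core mechanism described above can be sketched in code: a denoising network conditioned on a semantic segmentation mask is run through a reverse diffusion process in latent space, producing a latent that a VAE decoder would map to a hyperspectral cube. This is a minimal illustrative sketch, not the authors' implementation; the network, its conditioning-by-concatenation, the toy noise schedule, and all names (`ToyDenoiser`, `sample_hsi_latent`) are assumptions made for illustration.

```python
# Hypothetical sketch of mask-conditioned latent diffusion sampling for
# generative augmentation. All components are simplified stand-ins.
import torch
import torch.nn as nn


class ToyDenoiser(nn.Module):
    """Stand-in for the LDM's denoising network; conditions on the
    segmentation mask by channel-wise concatenation with the latent."""

    def __init__(self, latent_ch=4, mask_ch=1, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(latent_ch + mask_ch, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, latent_ch, 3, padding=1),
        )

    def forward(self, z, mask, t):
        # A real model would also embed the timestep t; omitted here.
        return self.net(torch.cat([z, mask], dim=1))


@torch.no_grad()
def sample_hsi_latent(denoiser, mask, steps=10, latent_ch=4):
    """Simplified DDPM-style reverse process: start from Gaussian noise
    and iteratively denoise, conditioned on the semantic mask."""
    b, _, h, w = mask.shape
    z = torch.randn(b, latent_ch, h, w)
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    for t in reversed(range(steps)):
        eps_hat = denoiser(z, mask, t)
        a, ab = alphas[t], alpha_bars[t]
        # Posterior mean estimate for the previous timestep.
        z = (z - (1 - a) / torch.sqrt(1 - ab) * eps_hat) / torch.sqrt(a)
        if t > 0:
            z = z + torch.sqrt(betas[t]) * torch.randn_like(z)
    # In the full pipeline, z would be decoded by a pretrained VAE
    # decoder into a multi-channel hyperspectral cube.
    return z


mask = torch.randint(0, 2, (1, 1, 32, 32)).float()  # toy binary mask
z = sample_hsi_latent(ToyDenoiser(), mask)
```

Synthetic image-mask pairs produced this way can be mixed into the real training set for the downstream segmentation model, which is the generative-augmentation use assessed in the paper.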