This paper addresses class imbalance in skin lesion classification by generating synthetic dermatological images with class-conditioned diffusion models and pretraining a large ViT on this synthetic data using masked autoencoders (MAE). The authors then distill the large ViT into a smaller ViT student suitable for mobile deployment. The reported results show improved classification performance and efficient on-device inference from the combination of MAE pretraining on synthetic data and distillation.
Synthetic data from class-conditioned diffusion models, combined with MAE pretraining and knowledge distillation, lets smaller ViTs improve skin lesion classification while staying light enough for mobile clinical applications.
Skin lesion classification datasets often suffer from severe class imbalance, with malignant cases significantly underrepresented, leading to biased decision boundaries during deep learning training. We address this challenge using class-conditioned diffusion models to generate synthetic dermatological images, followed by self-supervised MAE pretraining to enable huge ViT models to learn robust, domain-relevant features. To support deployment in practical clinical settings, where lightweight models are required, we apply knowledge distillation to transfer these representations to a smaller ViT student suitable for mobile devices. Our results show that MAE pretraining on synthetic data, combined with distillation, improves classification performance while enabling efficient on-device inference for practical clinical use.
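The distillation step can be illustrated with a minimal sketch. The abstract does not state which distillation objective is used, so this assumes the common soft-target formulation: a temperature-scaled KL term between teacher and student predictions, blended with a cross-entropy term on the true label. The function names and the `temperature` and `alpha` values are hypothetical choices for illustration, not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Blend a soft-target KL term with a hard-label cross-entropy term.

    Assumed formulation (not confirmed by the source):
        alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label)
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL divergence between the softened teacher and student distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Standard cross-entropy of the student against the ground-truth class.
    ce = -log_softmax(student_logits)[label]

    # T^2 keeps the soft-term gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Example: a two-class case (e.g. benign = 0, malignant = 1).
matched = distillation_loss([0.0, 2.0], [0.0, 2.0], label=1)
mismatched = distillation_loss([2.0, 0.0], [0.0, 2.0], label=1)
print(matched < mismatched)
```

In this sketch, a student that agrees with the teacher incurs only the hard-label term, while a student that disagrees is penalized by both terms. In a real training loop the same quantity would be computed per batch on tensors by a deep learning framework.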