This paper explores the use of Vision Transformers (ViTs) for classifying lymphoma subtypes, specifically anaplastic large cell lymphoma (ALCL) versus classic Hodgkin lymphoma (cHL). To address the impracticality of fully supervised training due to resource constraints, the authors implemented a weakly supervised training approach using slide-level labels for image patches. The resulting ViT model, trained on 100,000 image patches, achieved a diagnostic accuracy of 91.85%, F1 score of 0.92, and AUC of 0.98, demonstrating its potential for clinical application.
Weakly supervised ViTs can achieve high accuracy in lymphoma classification, offering a practical alternative to fully supervised methods that require extensive manual annotation.
Vision transformers (ViTs) have been shown to allow for more flexible feature detection and can outperform convolutional neural networks (CNNs) when pre-trained on sufficient data. Because of these promising feature detection capabilities, we deployed ViTs for morphological classification of anaplastic large cell lymphoma (ALCL) versus classic Hodgkin lymphoma (cHL). We had previously designed a ViT model that was trained on a small dataset of 1,200 image patches with fully supervised training. That model achieved a diagnostic accuracy of 100% and an F1 score of 1.0 on the independent test set. Because fully supervised training is impractical given the limited availability of expert annotators in both the training and testing phases, we studied a modified approach to the training data (weakly supervised training) and show that automatically labeling training image patches at the slide level of each whole-slide image is a more practical solution for clinical use of ViTs. Our ViT model, trained on a larger dataset of 100,000 image patches, achieves an accuracy of 91.85%, an F1 score of 0.92, and an area under the curve (AUC) of 0.98. These values indicate that a ViT model with weakly supervised training is a suitable deep learning module for clinical model development using automated image patch extraction.
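The slide-level labeling described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Patch` type, the function names, the slide identifiers, and the choice of ALCL as the positive class for F1 are all assumptions made for illustration. The sketch shows only how every patch might inherit the label of its whole-slide image, and how accuracy and a binary F1 score could then be computed from patch-level predictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical patch record: the slide it came from and its position."""
    slide_id: str
    x: int
    y: int


def label_patches(patches, slide_labels):
    """Weak supervision: each patch inherits the label of its slide.

    No per-patch annotation is used, so some patches may carry a label
    that does not match their local morphology (label noise).
    """
    return [(p, slide_labels[p.slide_id]) for p in patches]


def accuracy_and_f1(y_true, y_pred, positive="ALCL"):
    """Accuracy and binary F1, treating `positive` as the positive class.

    Which class the paper treats as positive is not stated; this is assumed.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Illustrative data only: two slides, two patches each.
slide_labels = {"slide_A": "ALCL", "slide_B": "cHL"}
patches = [
    Patch("slide_A", 0, 0),
    Patch("slide_A", 256, 0),
    Patch("slide_B", 0, 0),
    Patch("slide_B", 0, 256),
]
labeled = label_patches(patches, slide_labels)
y_true = [label for _, label in labeled]
y_pred = ["ALCL", "cHL", "cHL", "cHL"]  # made-up predictions
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

The trade-off in this design is that labeling cost drops to one label per slide, while the model must tolerate patches whose inherited label is not locally accurate.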