RadAnnotate leverages LLMs to automate radiology report annotation by training entity-specific classifiers, generating retrieval-augmented synthetic reports, and implementing confidence-based selective automation. Synthetic data augmentation narrows the performance gap between synthetic-only and gold-trained models, particularly for uncertain observations in low-resource scenarios. By selectively automating annotation based on confidence thresholds, RadAnnotate achieves high accuracy (0.86-0.92 entity match score) while significantly reducing the need for expert review (55-90% automation).
LLMs can automate up to 90% of radiology report annotations with high accuracy, slashing expert review time.
Radiology report annotation is essential for clinical NLP, yet manual labeling is slow and costly. We present RadAnnotate, an LLM-based framework that combines entity-specific classifiers, retrieval-augmented synthetic report generation, and confidence-based selective automation to reduce expert labeling effort in RadGraph. We study RadGraph-style entity labeling (graph nodes) and leave relation extraction (edges) to future work. First, we train entity-specific classifiers on gold-standard reports and characterize their strengths and failure modes across anatomy and observation categories, finding uncertain observations hardest to learn. Second, we generate RAG-guided synthetic reports and show that synthetic-only models remain within 1-2 F1 points of gold-trained models, and that synthetic augmentation is especially helpful for uncertain observations in a low-resource setting, improving F1 from 0.61 to 0.70. Finally, by learning entity-specific confidence thresholds, RadAnnotate automatically annotates 55-90% of reports at a 0.86-0.92 entity match score while routing low-confidence cases to expert review.
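The selective-automation step described above can be sketched as a simple routing rule: each predicted entity is auto-accepted when its classifier confidence clears a per-entity-type threshold, and is otherwise queued for expert review. This is a minimal illustration, not the paper's implementation; the threshold values, entity-type names, and data layout below are all hypothetical.

```python
# Minimal sketch of confidence-based selective automation.
# Threshold values and entity-type names are hypothetical placeholders,
# not the ones learned in the paper.
ENTITY_THRESHOLDS = {
    "anatomy": 0.80,
    "observation_definite": 0.85,
    "observation_uncertain": 0.95,  # hardest category gets the strictest threshold
}

def route(predictions):
    """Split predictions into auto-accepted annotations and expert-review cases."""
    auto, review = [], []
    for p in predictions:
        # Unknown entity types default to a threshold of 1.0, i.e. always reviewed.
        threshold = ENTITY_THRESHOLDS.get(p["entity_type"], 1.0)
        (auto if p["confidence"] >= threshold else review).append(p)
    return auto, review

preds = [
    {"entity_type": "anatomy", "confidence": 0.91, "span": "left lung"},
    {"entity_type": "observation_uncertain", "confidence": 0.70, "span": "possible effusion"},
]
auto, review = route(preds)
# "left lung" clears its threshold and is auto-accepted;
# "possible effusion" falls below 0.95 and is routed to an expert.
```

In practice the thresholds would be tuned per entity type on held-out validation data to hit a target precision, which is what lets the harder uncertain-observation category trade automation rate for accuracy.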