This study investigates the potential of open-source small language models (SLMs) for reliable and privacy-preserving clinical triage, specifically Emergency Severity Index (ESI) assignment. The authors find that Qwen2.5-7B, when prompted with clinical vignettes, offers the best balance of accuracy and efficiency. Crucially, after domain adaptation via fine-tuning on expert-curated and silver-standard pediatric triage data, Qwen2.5-7B reduces discordance and clinically significant errors more than both baseline SLMs and advanced proprietary LLMs such as GPT-4o.
Forget giant LLMs: fine-tuned small language models can actually *beat* GPT-4o on critical clinical tasks like emergency triage.
Accurate and consistent Emergency Severity Index (ESI) assignment remains a persistent challenge in emergency departments, where highly variable free-text triage documentation contributes to mistriage and workflow inefficiencies. This study evaluates whether open-source small language models (SLMs) can serve as reliable, privacy-preserving decision-support tools for clinical triage. We systematically compared multiple SLMs across diverse prompting pipelines and found that clinical vignettes (concise summaries of triage narratives) yielded the most accurate predictions. Among the SLMs, Qwen2.5-7B demonstrated the strongest balance of accuracy, stability, and computational efficiency. Through large-scale domain adaptation using expert-curated and silver-standard pediatric triage data, fine-tuned Qwen2.5-7B models substantially reduced discordance and clinically significant errors, outperforming all baseline SLMs and advanced proprietary large language models (LLMs) such as GPT-4o. These findings highlight the feasibility of institution-specific SLMs for reliable, privacy-preserving ESI decision support and underscore the importance of targeted fine-tuning over more complex inference strategies.
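
To make the discordance measure concrete, here is a minimal sketch of how agreement between model-assigned and reference ESI levels might be scored. It assumes ESI levels are integers from 1 (most urgent) to 5 (least urgent), so a predicted level numerically higher than the reference counts as under-triage. The function and field names are hypothetical, and the abstract does not state how the study defines a clinically significant error, so that metric is not modeled here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TriageAgreement:
    discordance_rate: float   # share of cases where predicted != reference
    under_triage_rate: float  # predicted less urgent than reference
    over_triage_rate: float   # predicted more urgent than reference


def triage_agreement(reference: Sequence[int], predicted: Sequence[int]) -> TriageAgreement:
    """Compare predicted ESI levels with reference levels (1 = most urgent)."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("reference and predicted must be non-empty and equal in length")
    n = len(reference)
    # A higher ESI number means lower acuity, so predicted > reference is under-triage.
    under = sum(1 for r, p in zip(reference, predicted) if p > r)
    over = sum(1 for r, p in zip(reference, predicted) if p < r)
    return TriageAgreement(
        discordance_rate=(under + over) / n,
        under_triage_rate=under / n,
        over_triage_rate=over / n,
    )
```

Calling `triage_agreement` once per model on the same reference labels would give comparable rates for a baseline SLM, a fine-tuned SLM, and a proprietary LLM.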