Microsoft Research | Jul 7, 2025 | arXiv:2507.05517

Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications

Jean-Philippe Corbeil, Asma Ben Abacha, George Michalopoulos, Phillip Swazinna, Miguel Del-Agua, J. Tremblay, Akila Jeeson Daniel, Cari Bader, Yu-Cheng Cho, Pooja Krishnan, Nathan Bodenstab, Thomas Lin, Wenxuan Teng, Francois Beaulieu, Paul Vozila

AI Summary

This paper addresses the underexplored NLP tasks of structured tabular reporting from nurse dictations and medical order extraction from doctor-patient consultations, which are critical for reducing healthcare provider documentation burden. The authors evaluate the performance of both open- and closed-weight LLMs on these tasks using private and newly released open-source datasets (SYNUR and SIMORD). They also propose an agentic pipeline for generating realistic, non-sensitive nurse dictations to facilitate structured extraction of clinical observations.

Key Contribution

The paper evaluates open- and closed-weight LLMs on structured reporting from nurse dictations and medical order extraction from doctor-patient consultations, releases two new open-source datasets (SYNUR and SIMORD), and proposes an agentic pipeline for generating realistic, non-sensitive nurse dictations.

Abstract

Large language models (LLMs) such as GPT-4o and o1 have demonstrated strong performance on clinical natural language processing (NLP) tasks across multiple medical benchmarks. Nonetheless, two high-impact NLP tasks - structured tabular reporting from nurse dictations and medical order extraction from doctor-patient consultations - remain underexplored due to data scarcity and sensitivity, despite active industry efforts. Practical solutions to these real-world clinical tasks can significantly reduce the documentation burden on healthcare providers, allowing greater focus on patient care. In this paper, we investigate these two challenging tasks using private and open-source clinical datasets, evaluating the performance of both open- and closed-weight LLMs, and analyzing their respective strengths and limitations. Furthermore, we propose an agentic pipeline for generating realistic, non-sensitive nurse dictations, enabling structured extraction of clinical observations. To support further research in both areas, we release SYNUR and SIMORD, the first open-source datasets for nurse observation extraction and medical order extraction.

Natural Language Processing | Speech & Audio | Tool Use & Agents

Citation Metrics

Citations: 4

Influential citations: 0

References: 41

Year: 2025

Venue: Conference on Empirical Methods in Natural Language Processing
