Feb 26, 2026

arXiv:2602.23062

Toward Automatic Filling of Case Report Forms: A Case Study on Data from an Italian Emergency Department

Gabriela Anna Kaczmarek, Pietro Ferrazzi, Lorenzo Porta, Vicky Rubini, Bernardo Magnini

AI Summary

This paper introduces a new dataset of clinical notes from an Italian Emergency Department, annotated against a pre-defined Case Report Form (CRF) containing 134 items. The authors define the CRF-filling task and an evaluation metric, and conduct pilot experiments using an open-source LLM in a zero-shot setting. Results indicate that zero-shot CRF-filling in Italian is feasible but highlight biases in LLM outputs, such as a tendency to select "unknown" answers.

Key Contribution

Zero-shot CRF-filling from Italian clinical notes is feasible with an open-source LLM, but its outputs show biases, such as a tendency toward "unknown" answers, that need to be corrected.

Abstract

Case Report Forms (CRFs) collect data about patients and are at the core of well-established practices to conduct research in clinical settings. With the recent progress of language technologies, there is an increasing interest in automatic CRF-filling from clinical notes, mostly based on the use of Large Language Models (LLMs). However, there is a general scarcity of annotated CRF data, both for training and testing LLMs, which limits the progress on this task. As a step in the direction of providing such data, we present a new dataset of clinical notes from an Italian Emergency Department annotated with respect to a pre-defined CRF containing 134 items to be filled. We provide an analysis of the data, define the CRF-filling task and metric for its evaluation, and report on pilot experiments where we use an open-source state-of-the-art LLM to automatically execute the task. Results of the case-study show that (i) CRF-filling from real clinical notes in Italian can be approached in a zero-shot setting; (ii) LLMs' results are affected by biases (e.g., a cautious behaviour favours "unknown" answers), which need to be corrected.

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 13

Year: 2026

Venue: N/A
