GazeVaLM, a new eye-tracking dataset, captures the gaze patterns of 16 expert radiologists evaluating 30 real and 30 AI-generated chest X-rays under two tasks: diagnostic assessment and real-fake classification. The dataset also includes diagnostic labels, authenticity judgments, and predictions from 6 multimodal LLMs, enabling direct comparison of human and AI perception. The accompanying analyses cover gaze agreement, inter-observer consistency, diagnostic accuracy, and authenticity detection, providing a benchmark for evaluating clinical realism in AI-generated medical images.
Radiologists' gaze patterns on real versus AI-generated chest X-rays are now publicly available, supporting the study of how experts visually process and interpret synthetic medical imagery.
We introduce GazeVaLM, a public eye-tracking dataset for studying clinical perception during chest radiograph authenticity assessment. The dataset comprises 960 gaze recordings from 16 expert radiologists interpreting 30 real and 30 synthetic chest X-rays (generated by diffusion-based generative AI) under two conditions: diagnostic assessment and real-fake classification (Visual Turing test). For each image-observer pair, we provide raw gaze samples, fixation maps, scanpaths, saliency density maps, structured diagnostic labels, and authenticity judgments. We extend the protocol to 6 state-of-the-art multimodal LLMs, releasing their predicted diagnoses, authenticity labels, and confidence scores under matched conditions, which enables direct human-AI comparison at both the decision and uncertainty levels. We further provide analyses of gaze agreement, inter-observer consistency, and benchmarking of radiologists versus LLMs in diagnostic accuracy and authenticity detection. GazeVaLM supports research in gaze modeling, clinical decision-making, human-AI comparison, generative image realism assessment, and uncertainty quantification. By jointly releasing visual attention data, clinical labels, and model predictions, we aim to facilitate reproducible research on how experts and AI systems perceive, interpret, and evaluate medical images. The dataset is available at https://huggingface.co/datasets/davidcwong/GazeVaLM.
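As an illustration of the kind of gaze-agreement analysis the abstract mentions, the sketch below builds a saliency density map from a list of fixations and compares two observers with a Pearson correlation between their maps. It is a minimal example under stated assumptions, not the authors' pipeline: the `(x, y, duration)` fixation format, the Gaussian width `sigma`, the image size, and the function names `fixation_density` and `map_correlation` are all hypothetical and do not reflect the dataset's actual schema or the method used to produce its released maps.

```python
import numpy as np


def fixation_density(fixations, height, width, sigma=3.0):
    """Build a saliency density map from fixations.

    `fixations` is an iterable of (x, y, duration) tuples in pixel
    coordinates (an assumed format, not the dataset's schema). Each
    fixation contributes an isotropic Gaussian weighted by its duration;
    the result is normalized to sum to 1.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=float)
    for x, y, duration in fixations:
        sq_dist = (xs - x) ** 2 + (ys - y) ** 2
        density += duration * np.exp(-sq_dist / (2.0 * sigma ** 2))
    total = density.sum()
    return density / total if total > 0 else density


def map_correlation(map_a, map_b):
    """Pearson correlation between two maps of the same shape."""
    a = map_a.ravel() - map_a.mean()
    b = map_b.ravel() - map_b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


# Toy fixations for three hypothetical observers on a 64x64 image.
obs_a = [(10, 10, 0.3), (20, 15, 0.5)]
obs_b = [(11, 10, 0.4), (19, 16, 0.4)]  # looks at nearly the same regions as A
obs_c = [(50, 50, 0.5)]                 # looks at a different region

map_a = fixation_density(obs_a, 64, 64)
map_b = fixation_density(obs_b, 64, 64)
map_c = fixation_density(obs_c, 64, 64)

agreement_ab = map_correlation(map_a, map_b)
agreement_ac = map_correlation(map_a, map_c)
print(f"A vs B: {agreement_ab:.3f}, A vs C: {agreement_ac:.3f}")
```

Averaging such pairwise scores over all observer pairs for an image gives one simple inter-observer consistency measure; other saliency metrics could be substituted for the correlation without changing the structure of the sketch.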