Mar 10, 2026 | arXiv:2603.09733

FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video Analysis

Xiaotian Hu, Junwei Huang, Mingxuan Liu, Kasidit Anmahapong, Yifei Chen, Yitong Luo, Yiming Huang, Xuguang Bai, Zihan Li, Yi Liao, Haibo Qu, Qiyuan Tian

AI Summary

The authors introduce FetalAgents, a multi-agent system designed for comprehensive fetal ultrasound image and video analysis, addressing the limitations of existing tools in balancing task-specific accuracy with whole-process versatility. FetalAgents dynamically orchestrates specialized vision experts for diagnosis, measurement, and segmentation tasks, and supports end-to-end video stream summarization by identifying keyframes across anatomical planes and synthesizing them into clinical reports. Multi-center evaluations across eight clinical tasks demonstrate that FetalAgents outperforms specialized models and multimodal large language models, providing a robust and auditable solution for fetal ultrasound analysis.

Key Contribution

FetalAgents dynamically orchestrates specialized vision experts for fetal ultrasound analysis, outperforming specialized models and multimodal large language models across eight clinical tasks and producing structured clinical reports from video streams.

Abstract

Fetal ultrasound (US) is the primary imaging modality for prenatal screening, yet its interpretation relies heavily on the expertise of the clinician. Despite advances in deep learning and foundation models, existing automated tools for fetal US analysis struggle to balance task-specific accuracy with the whole-process versatility required to support end-to-end clinical workflows. To address these limitations, we propose FetalAgents, the first multi-agent system for comprehensive fetal US analysis. Through a lightweight, agentic coordination framework, FetalAgents dynamically orchestrates specialized vision experts to maximize performance across diagnosis, measurement, and segmentation. Furthermore, FetalAgents advances beyond static image analysis by supporting end-to-end video stream summarization, where keyframes are automatically identified across multiple anatomical planes, analyzed by coordinated experts, and synthesized with patient metadata into a structured clinical report. Extensive multi-center external evaluations across eight clinical tasks demonstrate that FetalAgents consistently delivers the most robust and accurate performance when compared against specialized models and multimodal large language models (MLLMs), ultimately providing an auditable, workflow-aligned solution for fetal ultrasound analysis and reporting.
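The video summarization flow described in the abstract (keyframe selection per anatomical plane, analysis by an expert, synthesis with patient metadata) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Frame` type, the per-frame `quality` score, the plane labels, the `select_keyframes` and `summarize` helpers, and the stub expert are all hypothetical names and assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: none of these names come from the paper.
@dataclass
class Frame:
    index: int
    plane: str      # illustrative anatomical plane label, e.g. "head"
    quality: float  # assumed per-frame quality score in [0, 1]

# An "expert" is modeled as any callable mapping a frame to named outputs.
Expert = Callable[[Frame], Dict[str, float]]

def select_keyframes(frames: List[Frame]) -> Dict[str, Frame]:
    """Keep the highest-quality frame seen for each anatomical plane."""
    best: Dict[str, Frame] = {}
    for frame in frames:
        current = best.get(frame.plane)
        if current is None or frame.quality > current.quality:
            best[frame.plane] = frame
    return best

def summarize(frames: List[Frame],
              experts: Dict[str, Expert],
              metadata: Dict[str, object]) -> dict:
    """Route each keyframe to the expert registered for its plane and
    merge the outputs with patient metadata into one report dict."""
    findings = {}
    for plane, frame in select_keyframes(frames).items():
        expert = experts.get(plane)
        if expert is not None:  # planes without an expert are skipped
            findings[plane] = {"frame": frame.index, **expert(frame)}
    return {"patient": dict(metadata), "findings": findings}

# Placeholder inputs; the values carry no clinical meaning.
frames = [Frame(0, "head", 0.4), Frame(1, "head", 0.9), Frame(2, "abdomen", 0.7)]
experts = {"head": lambda f: {"placeholder_measurement": 1.0}}
report = summarize(frames, experts, {"gestational_age_weeks": 24})
```

Recording the selected frame index alongside each finding is one simple way to keep a report traceable to its source frames, in the spirit of the auditability the abstract mentions.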

Computer Vision, Multimodal Models, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
