Stanford HAI · AnsibleHealth Inc. · George Washington University · Mar 10, 2026 · arXiv:2603.09052

From Days to Minutes: An Autonomous AI Agent Achieves Reliable Clinical Triage in Remote Patient Monitoring

SeungHwan Kim, Tiffany H. Kung, H. Verma, Dilan Edirisinghe, Kaveh Sedehi, Johanna Alvarez, D. Shilling, A. Doyle, Ajit Chary, W. Borden, Ming Jack Po

AnsibleHealth Inc., San Francisco, USA; Department of Medicine, Stanford, USA; George Washington University, Washington, D.C., USA

AI Summary

Sentinel, an autonomous AI agent built on the Model Context Protocol (MCP), was developed to triage remote patient monitoring (RPM) vitals by gathering patient context through 21 clinical tools and applying multi-step reasoning. It was evaluated against clinicians and rule-based thresholds. Measured against a clinician majority vote, Sentinel achieved 95.8% emergency sensitivity and 88.5% sensitivity for actionable alerts, and it exceeded every individual clinician on both measures in a leave-one-out analysis. It also showed near-perfect self-consistency (kappa=0.850). Combined with a median cost of $0.34/triage, these results suggest a scalable and cost-effective answer to RPM data overload.

Key Contribution

An AI agent can triage remote patient monitoring data with higher sensitivity than individual clinicians, suggesting a path to scalable and cost-effective patient monitoring.

Abstract

Background: Remote patient monitoring (RPM) generates vast data, yet landmark trials (Tele-HF, BEAT-HF) failed because data volume overwhelmed clinical staff. While TIM-HF2 showed 24/7 physician-led monitoring reduces mortality by 30%, this model remains prohibitively expensive and unscalable.

Methods: We developed Sentinel, an autonomous AI agent using Model Context Protocol (MCP) for contextual triage of RPM vitals via 21 clinical tools and multi-step reasoning. Evaluation included: (1) self-consistency (100 readings × 5 runs); (2) comparison against rule-based thresholds; and (3) validation against 6 clinicians (3 physicians, 3 NPs) using a connected matrix design. A leave-one-out (LOO) analysis compared the agent against individual clinicians; severe overtriage cases underwent independent physician adjudication.

Results: Against a human majority-vote standard (N=467), the agent achieved 95.8% emergency sensitivity and 88.5% sensitivity for all actionable alerts (85.7% specificity). Four-level exact accuracy was 69.4% (quadratic-weighted kappa=0.778); 95.9% of classifications were within one severity level. In LOO analysis, the agent outperformed every clinician in emergency sensitivity (97.5% vs. 60.0% aggregate) and actionable sensitivity (90.9% vs. 69.5%). While disagreements skewed toward overtriage (22.5%), independent adjudication of severe gaps (≥2 levels) validated agent escalation in 88-94% of cases; consensus resolution validated 100%. The agent showed near-perfect self-consistency (kappa=0.850). Median cost was $0.34/triage.

Conclusions: Sentinel triages RPM vitals with sensitivity exceeding individual clinicians. By automating systematic context synthesis, Sentinel addresses the core limitation of prior RPM trials, offering a scalable path toward the intensive monitoring shown to reduce mortality while maintaining a clinically defensible overtriage profile.
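The agreement metrics named in the Results (majority-vote reference, sensitivity, specificity, within-one-level agreement, quadratic-weighted kappa) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. It assumes, hypothetically, that the four severity levels are encoded as integers 0 to 3, that level 3 means "emergency", that any level of 1 or higher counts as "actionable", and that majority-vote ties resolve toward the higher severity. None of these encodings is stated in the abstract, and the toy ratings are invented for illustration.

```python
from collections import Counter

N_LEVELS = 4        # four-level scale; integer encoding 0..3 is an assumption
EMERGENCY = 3       # hypothetical: highest level means "emergency"
ACTIONABLE = 1      # hypothetical: any level >= 1 counts as "actionable"


def majority_vote(ratings):
    """Most common rating; ties go to the higher severity (an assumption)."""
    counts = Counter(ratings)
    top = max(counts.values())
    return max(level for level, n in counts.items() if n == top)


def sensitivity(reference, predicted, threshold):
    """Share of reference cases at/above threshold that the prediction also flags."""
    flagged = [p >= threshold for r, p in zip(reference, predicted) if r >= threshold]
    if not flagged:
        raise ValueError("no positive reference cases")
    return sum(flagged) / len(flagged)


def specificity(reference, predicted, threshold):
    """Share of reference cases below threshold that the prediction also keeps below."""
    cleared = [p < threshold for r, p in zip(reference, predicted) if r < threshold]
    if not cleared:
        raise ValueError("no negative reference cases")
    return sum(cleared) / len(cleared)


def within_one(reference, predicted):
    """Share of cases where the two ratings differ by at most one level."""
    return sum(abs(r - p) <= 1 for r, p in zip(reference, predicted)) / len(reference)


def quadratic_weighted_kappa(reference, predicted, n_levels=N_LEVELS):
    """Cohen's kappa with disagreement weights (i - j)^2 / (n_levels - 1)^2."""
    n = len(reference)
    observed = [[0] * n_levels for _ in range(n_levels)]
    for r, p in zip(reference, predicted):
        observed[r][p] += 1
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            w = (i - j) ** 2 / (n_levels - 1) ** 2
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one level")
    return 1.0 - weighted_observed / weighted_expected


# Toy data, invented for illustration (not from the paper).
clinician_ratings = [[3, 3, 2], [0, 0, 1], [1, 2, 2], [0, 0, 0], [2, 2, 3], [1, 1, 0]]
agent = [3, 1, 2, 0, 3, 1]
reference = [majority_vote(r) for r in clinician_ratings]

print("emergency sensitivity:", sensitivity(reference, agent, EMERGENCY))
print("actionable sensitivity:", sensitivity(reference, agent, ACTIONABLE))
print("actionable specificity:", specificity(reference, agent, ACTIONABLE))
print("within one level:", within_one(reference, agent))
print("quadratic-weighted kappa:", quadratic_weighted_kappa(reference, agent))
```

Collapsing the four-level scale at a threshold is one common way to obtain binary sensitivity and specificity from ordinal labels; the quadratic weights penalize a two-level miss four times as heavily as a one-level miss, which is why a kappa of this kind can stay high even when exact four-level accuracy is moderate.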

