This paper introduces MAGIS, a framework for strabismus diagnosis that uses evidence-based multi-agent reasoning to improve the interpretability and accuracy of clinical decision-making. Its Dual-Evidence Constrained Context (DECC) mechanism organizes visual evidence and clinical diagnostic rules into a constrained context for diagnostic reasoning, and its Evidence-Based Corrective Verification (EBCV) mechanism triggers hypothesis refinement when the current hypothesis is inconsistent with that evidence. On a fine-grained strabismus benchmark, MAGIS raises the weighted F1 score from 72.0% to 91.3% relative to existing state-of-the-art systems and improves the clinical reliability of generated reports.
MAGIS transforms strabismus diagnosis from a black-box process into a transparent, evidence-driven framework that boosts accuracy and clinical reliability.
Strabismus is a common ocular disorder that requires fine-grained subtype diagnosis for individualized treatment planning. However, existing deep learning methods mainly provide diagnostic predictions without transparent reasoning, while recent large vision-language models (LVLMs), although promising for joint image understanding and report generation, remain highly prone to hallucination in this evidence-sensitive and rule-driven medical task. To address these challenges, we propose MAGIS, an evidence-based Multi-AGent reasoning framework for Interpretable Strabismus diagnosis. MAGIS transforms black-box end-to-end generation into a structured diagnostic process consisting of four stages: candidate hypothesis generation, dual-evidence constrained context construction, evidence-based corrective verification, and report generation. Specifically, we introduce a Dual-Evidence Constrained Context (DECC) mechanism that jointly organizes visual evidence from the photograph of the nine cardinal positions of gaze and evidence-based clinical diagnostic rules into a constrained context for reliable diagnostic reasoning. We further develop an Evidence-Based Corrective Verification (EBCV) mechanism that verifies whether the current diagnostic hypothesis is supported by visual evidence, heatmap-based visual cues, and evidence-based clinical diagnostic rules; hypothesis refinement is triggered when an inconsistency is detected. Experiments on a fine-grained strabismus benchmark demonstrate that MAGIS not only significantly outperforms state-of-the-art diagnostic systems, improving the weighted F1 score from 72.0% to 91.3%, but also substantially improves the clinical reliability (consistency, alignment, and completeness) of generated diagnostic reports. These results demonstrate that MAGIS provides an effective solution for building accurate, evidence-based, and clinically interpretable strabismus diagnosis systems.
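To make the verify-then-refine control flow concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes that visual evidence can be reduced to a set of symbolic findings and that each clinical rule lists the findings a subtype requires. The names `ConstrainedContext`, `verify`, and `diagnose`, and the two example subtypes and rules, are hypothetical illustrations; the abstract describes LVLM-based agents and heatmap cues, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class ConstrainedContext:
    """Toy stand-in for a dual-evidence context (hypothetical structure)."""
    findings: FrozenSet[str]          # visual evidence, reduced to symbolic labels
    rules: Dict[str, FrozenSet[str]]  # subtype -> findings its rule requires


def verify(hypothesis: str, ctx: ConstrainedContext) -> List[str]:
    """Return the required findings missing from the evidence.

    An empty list means the hypothesis is supported. A hypothesis with no
    rule is treated as unsupported rather than silently accepted.
    """
    if hypothesis not in ctx.rules:
        return ["<no rule for hypothesis>"]
    return sorted(ctx.rules[hypothesis] - ctx.findings)


def diagnose(
    candidates: List[str], ctx: ConstrainedContext
) -> Tuple[Optional[str], List[Tuple[str, List[str]]]]:
    """Check ranked candidates in order, refining to the next one on inconsistency."""
    trace: List[Tuple[str, List[str]]] = []
    for hypothesis in candidates:
        missing = verify(hypothesis, ctx)
        trace.append((hypothesis, missing))  # keep an auditable record
        if not missing:
            return hypothesis, trace
    return None, trace  # no candidate survived verification


# Illustrative labels only; not the paper's label set or clinical rules.
ctx = ConstrainedContext(
    findings=frozenset({"inward deviation"}),
    rules={
        "exotropia": frozenset({"outward deviation"}),
        "esotropia": frozenset({"inward deviation"}),
    },
)
final, trace = diagnose(["exotropia", "esotropia"], ctx)
print(final)
for hypothesis, missing in trace:
    print(hypothesis, "supported" if not missing else f"missing: {missing}")
```

The returned trace records which evidence each rejected hypothesis lacked, which is the kind of intermediate record a report-generation stage could draw on.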