Dialectic-Med, a multi-agent framework, mitigates diagnostic hallucinations in multimodal LLMs by simulating adversarial debate. It uses a proponent agent to generate hypotheses, an opponent agent with a novel visual falsification module to retrieve contradictory evidence, and a mediator to resolve conflicts. Experiments on medical VQA datasets show Dialectic-Med achieves state-of-the-art performance, enhances explanation faithfulness, and reduces hallucinations compared to single-agent baselines.
Simulating adversarial debate between specialized agents reduces hallucinations in MLLMs used for medical diagnosis and outperforms single-agent baselines in accuracy and trustworthiness.
Multimodal Large Language Models (MLLMs) in healthcare suffer from severe confirmation bias, often hallucinating visual details to support initial, potentially erroneous diagnostic hypotheses. Existing Chain-of-Thought (CoT) approaches lack intrinsic correction mechanisms, which leaves them vulnerable to error propagation. To bridge this gap, we propose Dialectic-Med, a multi-agent framework that enforces diagnostic rigor through adversarial dialectics. Unlike static consensus models, Dialectic-Med orchestrates a dynamic interplay among three role-specialized agents: a proponent that formulates diagnostic hypotheses; an opponent equipped with a novel visual falsification module that actively retrieves contradictory visual evidence to challenge the proponent; and a mediator that resolves conflicts via a weighted consensus graph. By explicitly modeling the cognitive process of falsification, our framework is designed to keep diagnostic reasoning tightly grounded in verified visual regions. Empirical evaluations on MIMIC-CXR-VQA, VQA-RAD, and PathVQA show that Dialectic-Med achieves state-of-the-art performance. Beyond accuracy, it improves explanation faithfulness and reduces hallucinations relative to single-agent baselines.
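The proponent, opponent, and mediator roles described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the visual falsification module or the weighted consensus graph are built, so everything here is an assumption made for illustration. The `Evidence` and `ConsensusGraph` classes, the `debate` function, the signed-sum scoring rule, the toy agents, the region names, and the weights are all hypothetical. The "image" is a plain dictionary of region labels standing in for real visual features, and the findings are not clinical guidance.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    region: str      # hypothetical label for an image region
    supports: bool   # True if it supports the hypothesis, False if it contradicts
    weight: float    # assumed confidence in [0, 1]


@dataclass
class ConsensusGraph:
    # Assumed structure: each hypothesis node links to its weighted evidence.
    edges: dict[str, list[Evidence]] = field(default_factory=dict)

    def add(self, hypothesis: str, evidence: Evidence) -> None:
        self.edges.setdefault(hypothesis, []).append(evidence)

    def score(self, hypothesis: str) -> float:
        # Assumed rule: supporting weights add, contradicting weights subtract.
        return sum(e.weight if e.supports else -e.weight
                   for e in self.edges.get(hypothesis, []))

    def resolve(self) -> str:
        # Mediator step: pick the hypothesis with the highest net score.
        return max(self.edges, key=self.score)


def debate(proponent, opponent, hypotheses, image):
    """Run one proponent/opponent exchange per hypothesis, then mediate."""
    graph = ConsensusGraph()
    for h in hypotheses:
        for ev in proponent(h, image):   # evidence offered for h
            graph.add(h, ev)
        for ev in opponent(h, image):    # falsification attempt against h
            graph.add(h, ev)
    return graph.resolve(), graph


# Toy stand-ins for the agents; real agents would be MLLM calls.
SUPPORT = {"pneumonia": {"opacity"}, "pleural_effusion": {"opacity"}}
CONTRA = {"pleural_effusion": {"sharp"}}


def toy_proponent(h, image):
    return [Evidence(r, True, 0.6) for r, f in image.items()
            if f in SUPPORT.get(h, ())]


def toy_opponent(h, image):
    return [Evidence(r, False, 0.8) for r, f in image.items()
            if f in CONTRA.get(h, ())]


image = {"left_lower_lobe": "opacity", "costophrenic_angle": "sharp"}
diagnosis, graph = debate(toy_proponent, toy_opponent,
                          ["pneumonia", "pleural_effusion"], image)
print(diagnosis)
```

In this toy run both hypotheses receive the same supporting evidence, and the opponent's contradicting region is what separates them. That is the behavior the abstract attributes to falsification, shown here only under the assumed scoring rule.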