The paper introduces MIND, a reinforcement learning framework for psychiatric consultation that addresses the challenges of subjective ambiguity and comorbidity complexity in medical dialogue systems. MIND uses a Criteria-Grounded Psychiatric Reasoning Bank (PRB) to retrieve similar consultations and distill criteria-grounded clinical supports, guiding inquiry and reasoning. The framework also incorporates rubric-based process rewards and value-aware trajectory rectification to improve information acquisition and diagnostic accuracy across multiple turns.
Achieve more accurate psychiatric diagnoses and empathetic interactions by grounding LLM agents in a clinical reasoning bank and explicitly rewarding rubric-based reasoning steps.
Large language models (LLMs) have advanced medical dialogue systems, yet psychiatric consultation poses substantially higher demands due to subjective ambiguity and comorbidity complexity: an agent must continuously extract psychopathological cues from incomplete and inconsistent patient reports in multi-turn interactions and perform rigorous differential diagnostic reasoning. However, existing methods face two fundamental challenges. First, without criteria-grounded clinical supports, they are prone to unsupported clinical assertions when symptoms are atypical or underspecified. Second, in multi-turn interactions, they struggle to mitigate inquiry drift (off-topic or low-yield questioning) and optimize questioning strategies. To address these challenges, we propose MIND, a unified inquiry-diagnosis reinforcement learning framework for psychiatric consultation. Specifically, we build a Criteria-Grounded Psychiatric Reasoning Bank (PRB) that summarizes dialogue context into clinical retrieval states, retrieves semantically similar reference consultations, and distills reusable criteria-grounded clinical supports to guide criteria-aligned inquiry and reasoning. Building on this foundation, MIND enforces explicit clinical reasoning with rubric-based process rewards to provide fine-grained supervision over intermediate decision steps, and incorporates a value-aware trajectory rectification mechanism to jointly improve information acquisition and diagnostic decision-making across turns. Extensive experiments demonstrate that MIND consistently outperforms strong baselines in diagnostic accuracy, empathetic interaction quality, interpretability, and generalization.
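The two mechanisms named in the abstract, retrieval from a reasoning bank and rubric-based process rewards, can be illustrated with a toy sketch. This is not the authors' implementation: the bag-of-words similarity stands in for whatever semantic retriever MIND uses, and every name here (`retrieve`, `rubric_reward`, `BANK`, the rubric items and their weights, the example entries) is a hypothetical placeholder chosen only for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for a learned embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(state, bank, k=1):
    """Return the k bank entries whose summaries best match the retrieval state."""
    query = bag_of_words(state)
    ranked = sorted(
        bank,
        key=lambda entry: cosine(query, bag_of_words(entry["summary"])),
        reverse=True,
    )
    return ranked[:k]


def rubric_reward(step_checks, weights):
    """Weighted fraction of rubric items that one reasoning step satisfies."""
    total = sum(weights.values())
    earned = sum(w for name, w in weights.items() if step_checks.get(name, False))
    return earned / total if total else 0.0


# Hypothetical bank entries: a summarized reference consultation plus the
# reusable support distilled from it. The contents are placeholders.
BANK = [
    {"summary": "low mood insomnia loss of interest",
     "support": "ask about duration and functional impairment"},
    {"summary": "excessive worry restlessness muscle tension",
     "support": "ask about how long the worry has persisted"},
]

# Retrieval state summarizing the dialogue so far (also a placeholder).
top = retrieve("patient reports insomnia and low mood", BANK, k=1)

# One intermediate step scored against a two-item rubric with assumed weights.
reward = rubric_reward(
    {"cites_criteria": True, "asks_on_topic": False},
    {"cites_criteria": 2.0, "asks_on_topic": 1.0},
)
```

In this sketch the retrieved support would be passed to the agent as guidance for its next question, and the per-step reward would feed the reinforcement learning objective. How MIND actually combines these signals, including its value-aware trajectory rectification, is not specified in the abstract and is not modeled here.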