CounselReflect is introduced as a toolkit for auditing mental-health support dialogues, providing structured, multi-dimensional reports instead of a single quality score. It integrates 12 model-based metrics from task-specific predictors and 69 rubric-based metrics derived from the literature, operationalized with configurable LLM judges. A user study and an expert review suggest that CounselReflect supports understandable, usable, and trustworthy auditing of mental-health dialogues.
Mental-health support chatbots get a much-needed reality check with CounselReflect, a toolkit that exposes their strengths and weaknesses through transparent, multi-dimensional audits.
Mental-health support is increasingly mediated by conversational systems (e.g., LLM-based tools), but users often lack structured ways to audit the quality and potential risks of the support they receive. We introduce CounselReflect, an end-to-end toolkit for auditing mental-health support dialogues. Rather than producing a single opaque quality score, CounselReflect provides structured, multi-dimensional reports with session-level summaries, turn-level scores, and evidence-linked excerpts to support transparent inspection. The system integrates two families of evaluation signals: (i) 12 model-based metrics produced by task-specific predictors, and (ii) rubric-based metrics that extend coverage via a literature-derived library (69 metrics) and user-defined custom metrics, operationalized with configurable LLM judges. CounselReflect is available as a web application, a browser extension, and a command-line interface (CLI), enabling both real-time use and large-scale batch auditing. A user study with 20 participants and an expert review with 6 mental-health professionals suggest that CounselReflect supports understandable, usable, and trustworthy auditing. A demo video and full source code are also provided.
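To make the report structure described above concrete, the sketch below shows one plausible way to represent a session-level audit that combines the two metric families with turn-level scores and evidence-linked excerpts. All names here (`MetricScore`, `TurnScore`, `SessionReport`) and the field layout are illustrative assumptions for exposition, not CounselReflect's actual API or schema.

```python
from dataclasses import dataclass, field

# Hypothetical report schema; names and fields are assumptions,
# not CounselReflect's actual API.

@dataclass
class MetricScore:
    name: str     # a model-based metric or a rubric item
    family: str   # "model" (task-specific predictor) or "rubric" (LLM judge)
    score: float  # assumed to be a normalized score
    evidence: list[str] = field(default_factory=list)  # linked dialogue excerpts

@dataclass
class TurnScore:
    turn_index: int            # position of the turn in the dialogue
    metrics: list[MetricScore]

@dataclass
class SessionReport:
    session_summary: str    # session-level summary
    turns: list[TurnScore]  # turn-level scores with linked evidence

# Minimal usage example with placeholder values.
report = SessionReport(
    session_summary="Supportive overall; weaker risk handling in turn 3.",
    turns=[
        TurnScore(
            turn_index=3,
            metrics=[
                MetricScore(
                    name="risk-handling",
                    family="rubric",
                    score=0.4,
                    evidence=["(dialogue excerpt would be linked here)"],
                )
            ],
        )
    ],
)
```

A structure along these lines would let each score remain traceable to specific excerpts, which is the property the abstract emphasizes over a single opaque quality score.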