Apr 28, 2026 · arXiv:2604.25120

Diagnosis, Bad Planning & Reasoning. Treatment, SCOPE -- Planning for Hybrid Querying over Clinical Trial Data

Suparno Roy Chowdhury, M. Choudhury, Tejas Anvekar, Muhammad Ali Khan, K. Khakwani, M. Sonbol, I. Riaz, Vivek Gupta

AI Summary

The paper addresses clinical trial table reasoning, where answers require semantic understanding and reasoning beyond direct cell lookups. The authors observe that LLMs often fail under implicit planning assumptions when recovering attributes such as therapy type or endpoint roles. To mitigate this, they introduce SCOPE, a multi-LLM planner-based framework that decomposes the task into row selection, structured planning, and execution. On 1,500 hybrid reasoning questions over oncology clinical-trial tables, SCOPE improves accuracy on reasoning-based questions and offers a stronger accuracy-efficiency tradeoff than heavier agentic baselines.

Key Contribution

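LLMs struggle with clinical trial table reasoning when planning is left implicit. A multi-LLM planner that makes the source field, reasoning rules, and output constraints explicit before answering improves accuracy on reasoning-based questions, with a stronger accuracy-efficiency tradeoff than heavier agentic baselines.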

Abstract

We study clinical trial table reasoning, where answers are not directly stored in visible cells but must be reasoned from semantic understanding through normalization, classification, extraction, or lightweight domain reasoning. Motivated by the observation that current LLM approaches often suffer from "bad reasoning" under implicit planning assumptions, we focus on settings in which the model must recover implicit attributes such as therapy type, added agents, endpoint roles, or follow-up status from partially observed clinical-trial tables. We propose SCOPE (Structured Clinical hybrid Planning for Evidence retrieval in clinical trials), a multi-LLM planner-based framework that decomposes the task into row selection, structured planning, and execution. The planner makes the source field, reasoning rules, and output constraints explicit before answer generation, reducing ambiguity relative to direct prompting. We evaluate SCOPE on 1,500 hybrid reasoning questions over oncology clinical-trial tables against zero-shot, few-shot, chain-of-thought, TableGPT2, Blend-SQL, and EHRAgent. Results show that explicit multi-LLM planning improves accuracy for reasoning-based questions while offering a stronger accuracy-efficiency tradeoff than heavier agentic baselines. Our findings position clinical trial reasoning as a distinct table understanding problem and highlight hybrid planner-based decomposition as an effective solution.
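
As a rough illustration of the three-stage decomposition described in the abstract (row selection, structured planning, execution), the sketch below wires the stages together in Python. It is not the authors' implementation. Every name in it (`Plan`, `select_rows`, `make_plan`, `execute`, `fake_llm`) is hypothetical, the table rows and trial names are invented placeholders, and the planner and model calls are hard-coded stubs standing in for LLM calls. The only idea taken from the abstract is that the plan states the source field, the reasoning rule, and the output constraint before any answer is generated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class Plan:
    """Hypothetical plan record: what the abstract says the planner makes explicit."""
    source_field: str           # column the answer is derived from
    reasoning_rule: str         # rule the executor should apply
    allowed_outputs: List[str]  # closed set of permitted answers


def select_rows(question: str, table: List[Row]) -> List[Row]:
    """Stage 1 (stub): keep rows whose trial name appears in the question."""
    return [row for row in table if row["trial"].lower() in question.lower()]


def make_plan(question: str, rows: List[Row]) -> Plan:
    """Stage 2 (stub): a real planner would be an LLM call; this one is hard-coded."""
    return Plan(
        source_field="arm",
        reasoning_rule="More than one agent joined by '+' means combination therapy.",
        allowed_outputs=["monotherapy", "combination"],
    )


def execute(plan: Plan, rows: List[Row], llm: Callable[[str], str]) -> str:
    """Stage 3: answer from the planned field only and enforce the output constraint."""
    evidence = " | ".join(row[plan.source_field] for row in rows)
    prompt = (
        f"Evidence: {evidence}\n"
        f"Rule: {plan.reasoning_rule}\n"
        f"Answer with one of: {', '.join(plan.allowed_outputs)}"
    )
    answer = llm(prompt).strip().lower()
    # Anything outside the allowed set is rejected rather than passed through.
    return answer if answer in plan.allowed_outputs else "unknown"


def answer_question(question: str, table: List[Row], llm: Callable[[str], str]) -> str:
    """Run the three stages in order."""
    rows = select_rows(question, table)
    plan = make_plan(question, rows)
    return execute(plan, rows, llm)


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: inspects only the evidence line of the prompt."""
    evidence_line = prompt.splitlines()[0]
    return "combination" if "+" in evidence_line else "monotherapy"


# Invented placeholder data; not taken from the paper's dataset.
TABLE = [
    {"trial": "TRIAL-A", "arm": "drug X + drug Y"},
    {"trial": "TRIAL-B", "arm": "drug Z"},
]

print(answer_question("What therapy type does TRIAL-A use?", TABLE, fake_llm))
```

The therapy type is never stored in a cell here. It is derived from the `arm` field under a stated rule, and the executor can only return a value from the declared output set, which is the kind of ambiguity reduction the abstract attributes to explicit planning.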

Reasoning & Chain-of-Thought, Scientific Discovery & Drug Design, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
