The paper introduces DrugPilot, an LLM-based agent system designed for end-to-end drug discovery workflows, addressing limitations in data processing, task automation, and tool support. DrugPilot employs a parameterized reasoning architecture with a novel memory pool to standardize heterogeneous data, enabling efficient multi-turn dialogue and complex decision-making. Experiments on a newly constructed drug instruction dataset demonstrate that DrugPilot significantly outperforms existing agents like ReAct and LoT in task completion rates across simple, multi-tool, and multi-turn scenarios.
DrugPilot substantially outperforms ReAct and LoT on drug discovery tasks, reaching completion rates of 98.0% in simple and 93.5% in multi-tool scenarios by using a parameterized memory pool to standardize heterogeneous data.
Large language models (LLMs) integrated with autonomous agents hold significant potential for advancing scientific discovery through automated reasoning and task execution. However, applying LLM agents to drug discovery is still constrained by challenges such as large-scale multimodal data processing, limited task automation, and poor support for domain-specific tools. To overcome these limitations, we introduce DrugPilot, an LLM-based agent system with a parameterized reasoning architecture designed for end-to-end scientific workflows in drug discovery. DrugPilot enables multi-stage research processes by integrating structured tool use with a novel parameterized memory pool. The memory pool converts heterogeneous data from both public sources and user-defined inputs into standardized representations. This design supports efficient multi-turn dialogue, reduces information loss during data exchange, and enhances complex scientific decision-making. To support training and benchmarking, we construct a drug instruction dataset covering eight core drug discovery tasks. Under the Berkeley function-calling benchmark, DrugPilot significantly outperforms state-of-the-art agents such as ReAct and LoT, achieving task completion rates of 98.0%, 93.5%, and 64.0% for simple, multi-tool, and multi-turn scenarios, respectively. These results highlight DrugPilot's potential as a versatile agent framework for computational science domains requiring automated, interactive, and data-integrated reasoning.
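The abstract does not describe how the memory pool is implemented, so the following is only a minimal sketch of the general idea, not DrugPilot's actual code. It assumes that "standardized representation" can be modeled as a list of dict records stored under a named slot, and that tools receive slot names as parameters instead of raw data. The names MemoryPool, put, get, summary, and count_heavy, as well as the mol_weight column, are hypothetical and chosen for illustration.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MemoryPool:
    """Hypothetical parameter store: slot name -> list of uniform dict records."""

    slots: dict[str, list[dict[str, Any]]] = field(default_factory=dict)

    def put(self, name: str, data: Any) -> str:
        """Normalize `data` into dict records and store it under `name`."""
        if isinstance(data, str):
            # Assumption: a string input is CSV text with a header row.
            records = [dict(row) for row in csv.DictReader(io.StringIO(data))]
        elif isinstance(data, dict):
            records = [dict(data)]
        elif isinstance(data, list) and all(isinstance(r, dict) for r in data):
            records = [dict(r) for r in data]
        else:
            raise TypeError(f"unsupported input type: {type(data).__name__}")
        self.slots[name] = records
        return name  # the handle that later dialogue turns pass around

    def get(self, name: str) -> list[dict[str, Any]]:
        """Return the records stored under `name`."""
        return self.slots[name]

    def summary(self) -> dict[str, int]:
        """Compact view that could go into a prompt: slot name -> record count."""
        return {name: len(records) for name, records in self.slots.items()}


def count_heavy(pool: MemoryPool, source: str, threshold: float) -> int:
    """Toy tool: receives a slot name, reads the data from the pool itself."""
    return sum(1 for r in pool.get(source) if float(r["mol_weight"]) > threshold)


pool = MemoryPool()
# A "public source" arriving as CSV text and a user-defined input as Python dicts.
pool.put("public_set", "smiles,mol_weight\nCCO,46.07\nc1ccccc1,78.11\n")
pool.put("user_set", [{"smiles": "CC(=O)O", "mol_weight": 60.05}])

heavy_public = count_heavy(pool, "public_set", 50.0)
heavy_user = count_heavy(pool, "user_set", 50.0)
```

In this sketch only the slot names and the compact summary would need to travel through the dialogue, while the full records stay in the pool. That is one plausible reading of how such a design could reduce information loss across turns; the paper's actual mechanism may differ.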