Cambridge, TU Munich, Turing Institute | May 25, 2026 | arXiv:2605.26081

VeriTrace: Evolving Mental Models for Deep Research Agents

Haolang Zhao, Yunbo Long, Lukas Beckenbauer, Alexandra Brintrup

AI Summary

The paper introduces VeriTrace, a cognitive-graph framework for deep research agents that explicitly regulates the evolution of the agent's mental model through three feedback loops: interpretive update, deviation feedback, and schema revision. This approach addresses the problem of error propagation and mixed-quality information contaminating intermediate representations in existing systems that rely on implicit LLM reasoning. Experiments using Qwen3.5-27B show VeriTrace improves performance over strong baselines on DeepResearch Bench (DRB) and DeepConsult, and achieves state-of-the-art open-source results on DRB with Config-DeepSeek.

Key Contribution

Explicitly regulating a research agent's mental model with feedback loops beats relying on implicit LLM reasoning, leading to significant gains on complex research tasks.

Abstract

Deep research agents face vast, interdependent, and pervasively uncertain information. Existing systems explore what evolving intermediate representations should look like, but leave their evolution to the LLM's implicit reasoning. Without explicit regulation, the intermediate layer is easily contaminated by mixed-quality information and propagates errors along its dependencies, so model scale often ends up substituting for absent regulation. We argue that an agent's mental model should instead evolve through explicit feedback that continuously aligns task understanding with reality, and identify three regulatory loops: interpretive update, deviation feedback, and schema revision. We realise this in VeriTrace, a cognitive-graph framework that explicitly implements the three loops. Using matched Qwen3.5-27B backbones, VeriTrace improves over the strongest matched baseline by 4.22 pp on DeepResearch Bench (DRB) Insight (1.49 pp Overall) and by 5.9 pp Overall win rate on DeepConsult. With Config-DeepSeek, it achieves the strongest reproducible open-source result on DRB.
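The abstract names the three loops but does not say how they are implemented, so the following is only a toy sketch of one possible reading, not VeriTrace itself. It assumes that interpretive update means admitting evidence into a graph only above a quality threshold, that deviation feedback means lowering confidence in a contradicted node and everything depending on it, and that schema revision means extending the set of node categories when evidence fits none. Every name here (`ToyMentalModel`, `Node`, `min_quality`, `penalty`) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    kind: str          # schema category this node is filed under
    confidence: float  # belief in [0, 1]
    parents: list = field(default_factory=list)  # ids this node depends on


class ToyMentalModel:
    """Hypothetical graph-shaped mental model; not the paper's implementation."""

    def __init__(self, kinds, min_quality=0.5):
        self.kinds = set(kinds)         # current schema (assumed: a set of categories)
        self.min_quality = min_quality  # assumed admission threshold
        self.nodes = {}                 # node id -> Node

    def interpretive_update(self, node_id, text, kind, quality, parents=()):
        """Assumed loop 1: admit evidence only if it clears the quality bar."""
        if quality < self.min_quality:
            return False
        if kind not in self.kinds:
            self.schema_revision(kind)
        self.nodes[node_id] = Node(text, kind, quality, list(parents))
        return True

    def deviation_feedback(self, node_id, penalty=0.5):
        """Assumed loop 2: a contradicted node and its dependants lose confidence."""
        affected = []
        stack = [node_id]
        while stack:
            current = stack.pop()
            if current in affected:
                continue  # already penalised, e.g. reached via two paths
            affected.append(current)
            self.nodes[current].confidence *= penalty
            stack.extend(i for i, n in self.nodes.items() if current in n.parents)
        return affected

    def schema_revision(self, kind):
        """Assumed loop 3: extend the schema when evidence fits no category."""
        self.kinds.add(kind)


model = ToyMentalModel(kinds={"fact", "hypothesis"})
model.interpretive_update("a", "Source A reports X", "fact", quality=0.9)
model.interpretive_update("b", "X implies Y", "hypothesis", quality=0.8, parents=["a"])
model.interpretive_update("c", "Unsourced rumour", "fact", quality=0.2)  # rejected
model.interpretive_update("d", "Rule Z applies", "constraint", quality=0.7)  # new category
affected = model.deviation_feedback("a")  # "a" is contradicted; "b" depends on it
```

In this toy version, the low-quality item never enters the graph, the unfamiliar category widens the schema, and contradicting one node discounts the claim that was built on it, which is the kind of error propagation along dependencies that the abstract says explicit regulation is meant to contain.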

Reasoning & Chain-of-Thought, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
