SJTUMay 5, 2026arXiv:2605.04036

OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

Yuwen Du, Rui Ye, Shuo Tang, Keduan Huang, Xinyu Zhu, Yuzhu Cai, Siheng Chen

AI Summary

The authors demonstrate that a simple supervised fine-tuning (SFT) approach can achieve state-of-the-art performance for training search agents, rivaling more complex pipelines involving continual pre-training and reinforcement learning. This is achieved by synthesizing informative and high-difficulty trajectories through scaling knowledge graph size, expanding tool set size, and strict low-step filtering. Trained on only 10.6k data points, OpenSeeker-v2 surpasses the performance of Tongyi DeepResearch across four benchmarks, establishing a new SOTA for 30B-sized agents using the ReAct paradigm.

Key Contribution

Forget resource-intensive pipelines: a purely academic team achieves SOTA search agent performance with just 10.6k SFT data points, outperforming models trained with CPT+SFT+RL.

Abstract

Deep search capabilities have become an indispensable competency for frontier Large Language Model (LLM) agents, yet their development remains dominated by industrial giants. The typical industry recipe involves a highly resource-intensive pipeline spanning pre-training, continual pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL). In this report, we show that when fueled with informative and high-difficulty trajectories, a simple SFT approach could be surprisingly powerful for training frontier search agents. By introducing three simple data synthesis modifications: scaling knowledge graph size for richer exploration, expanding the tool set size for broader functionality, and strict low-step filtering, we establish a stronger baseline. Trained on merely 10.6k data points, our OpenSeeker-v2 achieves state-of-the-art performance across 4 benchmarks (30B-sized agents with ReAct paradigm): 46.0% on BrowseComp, 58.1% on BrowseComp-ZH, 34.6% on Humanity's Last Exam, and 78.0% on xbench, surpassing even Tongyi DeepResearch trained with heavy CPT+SFT+RL pipeline, which achieves 43.4%, 46.7%, 32.9%, and 75.0%, respectively. Notably, OpenSeeker-v2 represents the first state-of-the-art search agent within its model scale and paradigm to be developed by a purely academic team using only SFT. We are excited to open-source the OpenSeeker-v2 model weights and share our simple yet effective findings to make frontier search agent research more accessible to the community.

Eval Frameworks & Benchmarks Open-Source Models & Weights Tool Use & Agents

Citation Metrics

Citations0

Influential citations0

References12

Year2026

VenueN/A

Related Papers

Finding related papers...