This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs to support scientific reasoning. The pipeline combines a multimodal parser, a 4B information-extraction backbone, and a tri-source agent interface, and it processes 2.46 million scientific papers to build the Scholar-KG dataset. The authors report that Agents-K1 outperforms existing methods on scientific information extraction, knowledge graph construction, and multi-hop scientific reasoning. This targets a gap in current LLM-based research agents, which have focused on agent orchestration rather than knowledge orchestration.
Agents-K1 converts full scientific papers into agent-native knowledge graphs and reports stronger performance than existing methods on scientific information extraction and multi-hop reasoning.
Current LLM-based research agents have advanced through agent orchestration, yet largely overlook scientific knowledge orchestration. Existing work often reduces papers to abstracts, surface mentions, and flat \texttt{cites} edges, omitting the key entities, claims, evidence, mechanisms, and method lineages essential for scientific reasoning. To this end, we introduce \textbf{Agents-K1}, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs. Agents-K1 integrates three components under a unifying theoretical foundation: a multimodal parser whose five-module schema captures entities, multimodal evidence, citations, and typed inter-entity relations across the full paper rather than abstracts alone; a 4B information-extraction backbone trained with GRPO under a rule-based reward; and a graphanything CLI, a tri-source agent interface that unifies web search, multimodal graph retrieval, and cross-document traversal. Using this pipeline, we process 2.46 million scientific papers across six subjects to produce \textbf{Scholar-KG}. We release a one-million-paper subset, and the full Scholar-KG is accessible via the SCP link below. The same pipeline can be extended to general-domain corpora and to schema-conformant data synthesis. Extensive experiments demonstrate that Agents-K1 achieves superior performance in scientific information extraction, knowledge graph construction, and multi-hop scientific reasoning.
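To make the contrast between flat \texttt{cites} edges and typed inter-entity relations concrete, the following is a minimal sketch of a typed graph with multi-hop traversal. It is not the Agents-K1 schema or the graphanything interface: the class `TypedGraph`, the relation names (`proposes`, `extends`, `cites`), and the entity names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict, deque


class TypedGraph:
    """Toy directed graph whose edges carry a relation type (hypothetical schema)."""

    def __init__(self):
        # head entity -> list of (relation, tail entity)
        self._out = defaultdict(list)

    def add_edge(self, head, relation, tail):
        self._out[head].append((relation, tail))

    def multi_hop(self, start, max_hops):
        """Return every path of 1..max_hops typed edges starting at `start`.

        Each path is a list of (head, relation, tail) triples, found by
        breadth-first search. Paths never revisit an entity.
        """
        paths = []
        queue = deque([(start, [])])
        while queue:
            node, path = queue.popleft()
            if len(path) == max_hops:
                continue
            for relation, tail in self._out.get(node, []):
                # Skip edges that would revisit an entity already on this path.
                if tail == start or any(step[2] == tail for step in path):
                    continue
                new_path = path + [(node, relation, tail)]
                paths.append(new_path)
                queue.append((tail, new_path))
        return paths


# Illustrative data only: entities and relations are made up.
g = TypedGraph()
g.add_edge("PaperA", "proposes", "MethodX")
g.add_edge("MethodX", "extends", "MethodY")
g.add_edge("PaperB", "proposes", "MethodY")
g.add_edge("PaperA", "cites", "PaperB")

paths = g.multi_hop("PaperA", max_hops=2)
for p in paths:
    print(" -> ".join(f"{h} [{r}] {t}" for h, r, t in p))
```

With only the `cites` edge, a traversal from `PaperA` could reach `PaperB` but would say nothing about how their methods relate. The typed edges expose the two-hop method lineage `PaperA [proposes] MethodX -> MethodX [extends] MethodY`, which is the kind of path multi-hop scientific reasoning depends on.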