MetaX | Mar 17, 2026 | arXiv:2603.16733

IQuest-Coder-V1 Technical Report

Jian Yang, Wei Zhang, Shawn Guo, Zhengmao Ye, Lin Jing, Shark Liu, Yizhi Li, Jiajun Wu, Ce Liu, Cening Liu, X. Ma, Yu Song, Yuyang Song, Siwei Wu, Yuwen Li, L. Liao, T. Zheng, Ziling Huang, Ze-Jiang Huang, Zelong Huang, Che Liu, Yangwei Xing, Yan Xing, Renyuan Li, Qing Cai, Qingsong Cai, Hanxu Yan, Han Yan, Siyue Wang, Shikai Li, Jason Klein Liu, Anji Huang, An Huang, Yongsheng Kang, Jinxin Zhang, Jinxing Zhang, Chuan Hao, Haowen Wang, Wei-Quan Gu, Weicheng Gu, Ran Tao, Mingjie Tang, Peihao Wu, Jianzhou Wang, Xianglong Liu, Weifeng Lv, Bryan Dai

AI Summary

IQuest-Coder-V1 is a new family of code LLMs (7B/14B/40B/40B-Loop) trained using a novel "code-flow multi-stage training paradigm" that captures the dynamic evolution of software logic. This paradigm involves pre-training on code facts, repository, and completion data, followed by mid-training integrating reasoning and agentic trajectories in long contexts (32k-128k), and finally post-training with reasoning-driven RL and instruction optimization. The resulting models achieve state-of-the-art performance in agentic software engineering, competitive programming, and complex tool use, with a "Loop" variant optimizing for deployment footprint via a recurrent mechanism.

Key Contribution

Code LLMs can achieve SOTA performance in agentic tasks by explicitly modeling the dynamic evolution of software logic across different training stages.

Abstract

In this report, we introduce the IQuest-Coder-V1 series (7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose the code-flow multi-stage training paradigm, which captures the dynamic evolution of software logic across the phases of the pipeline. Our models are developed through an evolutionary pipeline that starts with pre-training on code facts, repository data, and completion data. We then apply a specialized mid-training stage that integrates reasoning and agentic trajectories at 32k context and repository-scale data at 128k context to forge deep logical foundations. The models are finalized with post-training on specialized coding capabilities, which is bifurcated into two paths: the thinking path (utilizing reasoning-driven RL) and the instruct path (optimized for general assistance). IQuest-Coder-V1 achieves state-of-the-art performance among competitive models across critical dimensions of code intelligence: agentic software engineering, competitive programming, and complex tool use. To address deployment constraints, the IQuest-Coder-V1-Loop variant introduces a recurrent mechanism designed to optimize the trade-off between model capacity and deployment footprint, offering an architecturally enhanced path toward a better balance of efficacy and efficiency. We believe the release of the IQuest-Coder-V1 series, including the complete white-box chain of checkpoints from pre-training bases to the final thinking and instruction models, will advance research in autonomous code intelligence and real-world agentic systems.
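The staged pipeline described in the abstract can be pictured as a small checkpoint lineage. The following is only an illustrative sketch, not the authors' training code: the `Stage` class, the stage names, and the `lineage` helper are hypothetical, the 32k phase is assumed to precede the 128k phase, and "32k"/"128k" are assumed to mean 32 × 1024 and 128 × 1024 tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One training stage; every field name here is a hypothetical label."""
    name: str
    data: tuple[str, ...]
    context_tokens: Optional[int] = None  # None: not stated in the abstract
    parent: Optional[str] = None          # stage whose checkpoint this one starts from


# Stage contents follow the abstract; the ordering of the two mid-training
# phases and the exact token counts are assumptions.
STAGES = (
    Stage("pre-training", ("code facts", "repository data", "completion data")),
    Stage("mid-training-32k", ("reasoning trajectories", "agentic trajectories"),
          32 * 1024, "pre-training"),
    Stage("mid-training-128k", ("repository-scale data",),
          128 * 1024, "mid-training-32k"),
    # Post-training is bifurcated: both paths branch from the same checkpoint.
    Stage("post-training-thinking", ("reasoning-driven RL",),
          None, "mid-training-128k"),
    Stage("post-training-instruct", ("general-assistance instruction data",),
          None, "mid-training-128k"),
)


def lineage(name: str) -> list[str]:
    """Return the chain of stage names from the first stage up to `name`."""
    by_name = {stage.name: stage for stage in STAGES}
    chain: list[str] = []
    current: Optional[str] = name
    while current is not None:
        chain.append(current)
        current = by_name[current].parent
    return chain[::-1]
```

Calling `lineage("post-training-thinking")` and `lineage("post-training-instruct")` yields two chains that share every stage except the last, which mirrors the "white-box chain of checkpoints" the abstract says is released.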

Architecture Design (Transformers, SSMs, MoE) | Code Generation & Program Synthesis | Open-Source Models & Weights

Citation Metrics

Citations: 0

Influential citations: 0

References: 28

Year: 2026

Venue: N/A
