The paper introduces AIDev, a large-scale dataset of 932,791 agent-authored pull requests (Agentic-PRs) on GitHub, produced by five agents: OpenAI Codex, Devin, GitHub Copilot, Cursor, and Claude Code. The dataset addresses the lack of real-world data on how AI coding agents are used in software projects. It provides a foundation for studying AI adoption, developer productivity, and human-AI collaboration, and includes a curated subset of 33,596 Agentic-PRs from 2,807 repositories with over 100 stars, enriched with comments, reviews, commits, and related issues.
Discover how AI coding agents are *actually* being used in real-world software projects with a new dataset of nearly one million agent-authored pull requests.
AI coding agents are rapidly transforming software engineering by performing tasks such as feature development, debugging, and testing. Despite their growing impact, the research community lacks a comprehensive dataset capturing how these agents are used in real-world projects. To address this gap, we introduce AIDev, a large-scale dataset focused on agent-authored pull requests (Agentic-PRs) in real-world GitHub repositories. AIDev aggregates 932,791 Agentic-PRs produced by five agents: OpenAI Codex, Devin, GitHub Copilot, Cursor, and Claude Code. These PRs span 116,211 repositories and involve 72,189 developers. In addition, AIDev includes a curated subset of 33,596 Agentic-PRs from 2,807 repositories with over 100 stars, providing further information such as comments, reviews, commits, and related issues. This dataset offers a foundation for future research on AI adoption, developer productivity, and human-AI collaboration in the new era of software engineering.
Keywords: AI Agent, Agentic AI, Coding Agent, Agentic Coding, Agentic Software Engineering, Agentic Engineering