HUST · SMU · Mar 30, 2026 · arXiv:2603.28592

Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild

Yue Liu, Ratnadira Widyasari, Yanjie Zhao, I. Irsan, Ivana Clairine Irsan, David Lo

AI Summary

This paper investigates the technical debt introduced by AI coding assistants by analyzing 304,362 AI-authored commits across 6,275 GitHub repositories. The authors run static analysis before and after each commit to identify the code smells, bugs, and security issues introduced by five widely used AI coding assistants, then track each issue to the latest repository revision. The study identifies 484,606 distinct issues, and 24.2% of the tracked issues still survive at the latest revision, indicating long-term maintenance costs.

Key Contribution

AI coding assistants accumulate technical debt in real-world projects: 24.2% of the tracked issues they introduce still survive at the latest repository revision.

Abstract

AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To this end, we build a dataset of 304,362 verified AI-authored commits from 6,275 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis before and after the change to precisely attribute which code smells, bugs, and security issues the AI introduced. We then track each introduced issue from the introducing commit to the latest repository revision to study its lifecycle. We identify 484,606 distinct issues; code smells are by far the most common type, accounting for 89.1% of all issues. We also find that more than 15% of commits from every AI coding assistant introduce at least one issue, although the rates vary across tools. More importantly, 24.2% of tracked AI-introduced issues still survive at the latest revision of the repository. These findings show that AI-generated code can introduce long-term maintenance costs into real software projects and highlight the need for stronger quality assurance in AI-assisted development.
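
The before/after attribution and the survival measurement described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Issue` fields, the fingerprint-style matching on rule, path, and flagged text, and both function names are assumptions made here for illustration, since the abstract does not say how issues are matched across revisions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue fingerprint; all field names are assumptions."""
    rule: str     # static-analysis rule identifier
    path: str     # file the issue was reported in
    snippet: str  # normalized text of the flagged line (line numbers shift)
    kind: str     # "code_smell", "bug", or "security"


def introduced_issues(before, after):
    """Issues reported after a commit but not before it (multiset difference)."""
    diff = Counter(after) - Counter(before)  # keeps only positive counts
    return list(diff.elements())


def survival_rate(introduced, latest):
    """Fraction of introduced issues still reported at the latest revision."""
    if not introduced:
        return 0.0
    remaining = Counter(latest)
    survived = 0
    for issue in introduced:
        if remaining[issue] > 0:   # match each latest-revision report once
            remaining[issue] -= 1
            survived += 1
    return survived / len(introduced)


# Toy data: one pre-existing smell, two issues added by the commit.
old = Issue("unused-variable", "app.py", "tmp = 1", "code_smell")
new_smell = Issue("too-many-branches", "app.py", "def handle(req):", "code_smell")
new_bug = Issue("undefined-name", "util.py", "return reslt", "bug")

introduced = introduced_issues(before=[old], after=[old, new_smell, new_bug])
rate = survival_rate(introduced, latest=[old, new_smell])
print(len(introduced), rate)
```

In this toy run the commit introduces two issues, and one of them is still reported at the latest revision. Matching on the flagged text rather than on line numbers is a design choice of this sketch, so that unrelated edits above an issue do not make it look fixed.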

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
