This paper investigates the use of LLM-based agents to identify bug-introducing commits from fix commits in software repositories. The authors propose an agentic workflow that searches candidate commits using patterns derived from fix commit diffs and messages. The results show a significant improvement in F1-score from 0.64 to 0.81 on the Linux kernel dataset, surpassing previous state-of-the-art methods.
LLM agents leapfrog traditional methods for identifying bug-introducing commits, boosting F1-score by 17 points by intelligently searching for patterns in code changes.
Śliwerski, Zimmermann, and Zeller (SZZ) just won the 2026 ACM SIGSOFT Impact Award for asking: When do changes induce fixes? Their paper from 2005 served as the foundation for a wide array of approaches aimed at identifying bug-introducing changes (or commits) from fix commits in software repositories. But even after two decades of progress, the best-performing approach from 2025 yields a modest increase of 10 percentage points in F1-score on the most popular Linux kernel dataset. In this paper, we uncover how and why LLM-based agents can substantially advance the state-of-the-art in identifying bug-introducing commits from fix commits. We propose a simple agentic workflow based on searching a set of candidate commits and find that it raises the F1-score from 0.64 to 0.81 on the most popular Linux kernel dataset, a bigger jump than between the original 2005 method (0.54) and the previous SOTA (0.64). We also uncover why agents are so successful: They derive short greppable patterns from the fix commit diff and message and use them to effectively search and find bug-introducing commits in large candidate sets. Finally, we also discuss how these insights might enable further progress in bug detection, root cause understanding, and repair.
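The core idea the abstract describes, deriving short greppable patterns from the fix and searching a candidate set for the commit that introduced them, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the token heuristics, function names, and toy data are all assumptions, and the real workflow delegates pattern derivation to an LLM agent rather than a fixed regex.

```python
import re

def derive_patterns(fix_diff, fix_message):
    """Pull short, greppable tokens (identifiers of 4+ chars) from the
    lines the fix removed and from the fix commit message."""
    ident = re.compile(r"[A-Za-z_][A-Za-z0-9_]{3,}")
    tokens = set()
    for line in fix_diff.splitlines():
        # Removed lines (not the '---' file header) carry the buggy code.
        if line.startswith("-") and not line.startswith("---"):
            tokens.update(ident.findall(line))
    tokens.update(ident.findall(fix_message))
    return tokens

def rank_candidates(patterns, candidates):
    """Score each candidate commit by how many patterns appear in the
    lines it added; the bug-introducing commit should score highest."""
    scores = {}
    for sha, diff in candidates.items():
        added = "\n".join(l for l in diff.splitlines()
                          if l.startswith("+") and not l.startswith("+++"))
        scores[sha] = sum(1 for p in patterns if p in added)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy example: the fix removes a call that commit "abc123" introduced.
fix_diff = "--- a/f.c\n+++ b/f.c\n-    use_after_free(ptr);\n+    /* fixed */"
candidates = {
    "abc123": "+++ b/f.c\n+    use_after_free(ptr);",
    "def456": "+++ b/g.c\n+    unrelated_change();",
}
patterns = derive_patterns(fix_diff, "fix use-after-free")
ranking = rank_candidates(patterns, candidates)  # "abc123" ranks first
```

In the paper's setting the candidate set comes from the repository history and the patterns are chosen by the agent, but the principle is the same: the fix commit's removed lines and message point directly at the code the bug-introducing commit added.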