This study analyzes 7,416 reviewer-bot comments on 4,532 agentic pull requests (PRs) from the AI_Dev dataset to understand the impact of reviewer-bot feedback on PR acceptance and resolution. The analysis reveals that while reviewer bots focus on bug fixes, testing, and documentation with clear and concise feedback, the semantic relevance of comments is only moderate. The key finding is that higher reviewer bot activity volume correlates with longer PR resolution times and lower average feedback quality, suggesting a need for more targeted feedback.
More reviewer-bot comments on agentic pull requests are associated with *longer* resolution times, suggesting that targeted feedback matters more than comment volume in automated code review.
Autonomous coding agents are reshaping software development by creating pull requests (PRs) on GitHub, referred to as agentic PRs. In parallel, the review process is also becoming autonomous, thereby making reviewer bots key actors in the assessment of these agentic PRs. However, their influence on PR acceptance and resolution remains unclear. This study empirically investigates the relationship between reviewer-bot feedback and PR outcomes by analyzing how Reviewer Bot Feedback Quality (relevance, clarity, conciseness) and Reviewer Bot Activity Volume (comment count) are associated with PR acceptance and resolution time. We analyze 7,416 reviewer-bot comments on 4,532 PRs from the AI_Dev dataset (a dataset that captures AI agents' PRs in GitHub projects). Our results show that reviewer-bot comments mainly focus on bug fixes, testing, and documentation, are civil in tone, and are prescriptive in nature. Reviewer bots generally produce clear and concise feedback, though the semantic relevance of comments to underlying code changes is moderate. We find that higher Reviewer Bot Activity Volume is associated with longer PR resolution times and lower average feedback quality, showing that as bots generate more comments on a PR, the average pertinence of that feedback appears to degrade. At the same time, Reviewer Bot Feedback Quality shows no meaningful association with workflow outcomes. Our findings suggest that, in agentic PR workflows, reviewer bots should prioritize targeted high-relevance feedback over generating large numbers of comments.
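To make the kind of association described above concrete, the following is a minimal sketch of a rank correlation between per-PR comment count and two outcomes. It is not the study's method: the abstract does not name its statistical test, so Spearman's rank correlation is an assumption here, and the variable names and all numbers are hypothetical toy values rather than data from the AI_Dev dataset.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-PR toy values (assumptions, not results from the paper).
comment_counts = [1, 2, 2, 5, 8, 13]                    # reviewer-bot comments per PR
resolution_hours = [3.0, 4.5, 6.0, 20.0, 30.0, 72.0]    # time to close or merge
mean_relevance = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]         # average relevance score per PR

rho_time = spearman(comment_counts, resolution_hours)
rho_quality = spearman(comment_counts, mean_relevance)
print(f"volume vs. resolution time: rho = {rho_time:.2f}")
print(f"volume vs. mean relevance:  rho = {rho_quality:.2f}")
```

On this toy data the first coefficient is positive and the second negative, mirroring the direction of the reported associations. A rank-based measure is used in the sketch because resolution times are typically heavy-tailed; it describes association only and says nothing about causation.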