This study investigates the effectiveness of AI-generated reviews in enhancing the drafting process of academic papers, specifically focusing on 20 submissions in computer architecture. By developing the AI-Paper-Review tool, the authors quantify the alignment between AI and human reviews through a structured analysis of comments, revealing that AI can identify a substantial portion of issues raised by human reviewers while also uncovering additional concerns. The findings suggest that while AI review has potential to improve paper quality, ethical considerations surrounding its application in peer review remain critical.
AI reviews can identify significant issues in paper drafts that human reviewers might overlook, raising important questions about the future of academic peer review.
Research is advancing faster than ever with artificial intelligence (AI), and so are the corresponding research papers. The exploding volume of AI-generated papers has put a strain on peer review, leading to the use of AI-generated reviews, potentially widespread yet covert. However, this raises ethical concerns about confidentiality, quality, and fairness, and the broad research community has reached no consensus. We expect the debate to continue for a while, but in the meantime, we ask an alternative, practical question: can AI review improve paper drafting? We study 20 computer architecture papers, with varying levels of submission lineage, to expose how well AI review aligns with human review, quantified by a set of metrics we define. To conduct the case study, we build a web UI-integrated tool, AI-Paper-Review, that generates a structured AI review of a draft paper, available at https://github.com/unarylab/ai-paper-review. This tool selects several AI reviewers from a diverse pool, then clusters and ranks their comments by commonality and importance. It also supports aligning AI comments with human comments to facilitate metric-based validation. The case study shows that AI review can cover a significant fraction of human-raised issues, and that it also raises issues missing from human review. This paper is not intended to encourage using AI for peer review at the current stage, but to study (1) how AI review can improve paper drafting and (2) the potential and limitations of AI-based peer review. The release of the tool and the case study data is intended to instigate future research on this topic. Misuse for peer review would violate the ethics policies of major academic venues.
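The abstract does not spell out the metrics it defines, so the following is only a minimal sketch of what an alignment-based comparison could look like once AI comments have been matched to human comments. The names `Alignment`, `coverage_metrics`, `human_coverage`, and `ai_only_fraction` are hypothetical and are not taken from the AI-Paper-Review tool. The sketch assumes an alignment is a set of (AI comment, human comment) pairs, and reports the share of human comments matched by at least one AI comment and the share of AI comments with no human counterpart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One hypothetical match between an AI comment and a human comment."""
    ai_id: str
    human_id: str


def coverage_metrics(ai_ids, human_ids, alignments):
    """Compute two illustrative metrics (assumptions, not the paper's definitions).

    human_coverage:   fraction of human comments matched by at least one AI comment.
    ai_only_fraction: fraction of AI comments with no matching human comment.
    """
    ai_set, human_set = set(ai_ids), set(human_ids)

    # Ignore alignments that point at unknown comment ids.
    valid = [a for a in alignments
             if a.ai_id in ai_set and a.human_id in human_set]

    covered_human = {a.human_id for a in valid}
    matched_ai = {a.ai_id for a in valid}

    human_coverage = len(covered_human) / len(human_set) if human_set else 0.0
    ai_only_fraction = len(ai_set - matched_ai) / len(ai_set) if ai_set else 0.0
    return {"human_coverage": human_coverage,
            "ai_only_fraction": ai_only_fraction}


# Example with made-up ids: two AI comments match the same human comment.
example = coverage_metrics(
    ai_ids=["a1", "a2", "a3", "a4"],
    human_ids=["h1", "h2", "h3"],
    alignments=[Alignment("a1", "h1"), Alignment("a2", "h1"), Alignment("a3", "h2")],
)
```

Counting covered human comments as a set means that several AI comments matching the same human comment do not inflate coverage. Whether the paper's own metrics handle many-to-one matches this way is not stated in the abstract.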