The paper introduces DeepReview, a multi-stage framework for LLM-based paper review that mimics expert reviewers by incorporating structured analysis, literature retrieval, and evidence-based argumentation. The authors train DeepReviewer-14B on DeepReview-13K, a new dataset with structured annotations. The resulting DeepReviewer-14B outperforms CycleReviewer-70B and achieves win rates of 88.21% and 80.20% against GPT-o1 and DeepSeek-R1, respectively, demonstrating improved performance in automated paper review.
A 14B-parameter model can beat much larger LLMs at paper review by mimicking human expert reviewers' structured analysis and evidence-based reasoning.
Large Language Models (LLMs) are increasingly utilized in scientific research assessment, particularly in automated paper review. However, existing LLM-based review systems face significant challenges, including limited domain expertise, hallucinated reasoning, and a lack of structured evaluation. To address these limitations, we introduce DeepReview, a multi-stage framework designed to emulate expert reviewers by incorporating structured analysis, literature retrieval, and evidence-based argumentation. Using DeepReview-13K, a curated dataset with structured annotations, we train DeepReviewer-14B, which outperforms CycleReviewer-70B while using fewer tokens. In its best mode, DeepReviewer-14B achieves win rates of 88.21% and 80.20% against GPT-o1 and DeepSeek-R1, respectively. Our work sets a new benchmark for LLM-based paper review, with all resources publicly available. The code, model, dataset, and demo have been released at http://ai-researcher.net.