ByteDance, Stony Brook, UNSW | May 5, 2026 | arXiv:2605.03571

PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination

Qiyao Wang, Xinyi Chen, Longze Chen, Hongbo Wang, Hamid Alinejad-Rokny, Yuan Lin, Min Yang

AI Summary

The paper introduces PatRe, a novel benchmark designed to model the full lifecycle of patent examination, encompassing both Office Action generation and applicant rebuttal. This benchmark consists of 480 real-world patent cases and supports both oracle and retrieval-simulated evaluation settings to better reflect the iterative nature of patent examination. Experiments using various LLMs reveal performance disparities between proprietary and open-source models and task asymmetries between examiner analysis and applicant rebuttal, highlighting current limitations in complex legal reasoning and technical novelty judgment.

Key Contribution

LLMs struggle to navigate the complex, multi-turn justification and response dynamics of real-world patent examination, revealing critical gaps in legal reasoning and technical novelty judgment.

Abstract

Patent examination is a complex, multi-stage process requiring both technical expertise and legal reasoning, increasingly challenged by rising application volumes. Prior benchmarks predominantly view patent examination as discriminative classification or static extraction, failing to capture its inherently interactive and iterative nature, which resembles the peer review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark that models the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across various LLMs reveal critical insights into model performance, including differences between proprietary and open-source models, as well as task asymmetries between examiner analysis and applicant-side rebuttal. These findings highlight both the potential and current limitations of LLMs in modeling complex, real-world legal reasoning and technical novelty judgment in patent examination. We release our code and dataset to facilitate future research on patent examination modeling.

Eval Frameworks & Benchmarks, Natural Language Processing, Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 25

Year: 2026

Venue: N/A
