Hilary Greaves

University of Oxford

Corresponding author: alan.chan@governance.ai
Work completed as seasonal fellows at GovAI.
Equal contribution. Authors are free to list themselves as first author on their CVs.
Senior author.

Abstract

The automation of AI R&D (AIRDA) could have significant implications, but its extent and ultimate effects remain uncertain. We need empirical data to resolve these uncertainties, but existing data (primarily capability benchmarks) may not reflect real-world automation or capture its broader consequences, such as whether AIRDA accelerates capabilities more than safety progress or whether our ability to oversee AI R&D can keep pace with its acceleration. To address these gaps, this work proposes metrics to track the extent of AIRDA and its effects on AI progress and oversight. The metrics span dimensions such as capital share of AI R&D spending, researcher time allocation, and AI subversion incidents, and could help decision makers understand the potential consequences of AIRDA, implement appropriate safety measures, and maintain awareness of the pace of AI development. We recommend that companies and third parties (e.g. non-profit research organisations) start to track these metrics, and that governments support these efforts.

1 Introduction

Frontier AI companies aim to automate AI R&D. OpenAI’s CEO Sam Altman expects to have an automated AI researcher by 2028 (Bellan, 2025), while Google DeepMind’s CEO Demis Hassabis predicted in 2025 that an automated AI researcher was “a few years away” (Perrigo, 2025). According to Anthropic’s CEO Dario Amodei, current AI systems are “good enough at coding that some of the strongest engineers [he has] ever met are now handing over almost all their coding to AI” (Amodei, 2026). Many recent products from these companies, such as Codex, AntiGravity, and Claude Code, focus on automating software engineering, a key component of AI R&D.
Adoption is widespread: 47% of developers use AI tools daily (Stack Overflow, 2025) and 50% of new code at Google is AI-generated (Alphabet, 2026). Some qualitative reports also suggest that these tools can perform some of the R&D tasks expected of junior AI researchers (Anthropic, 2025b; 2026b).

The effects of AI R&D automation (AIRDA) could be significant, but might cut in different directions (Figure 1). AIRDA could accelerate AI progress, bringing forward AI’s benefits but also hastening the arrival of destructive capabilities, including those related to weapons of mass destruction, or other forms of disruption such as unemployment (Bengio et al., 2026). Whether this acceleration would be beneficial overall remains unclear. Key uncertainties include whether AIRDA would accelerate defensive capabilities more than offensive ones, whether safety research will keep pace with capabilities research, and whether human institutions can adapt to the accelerated pace of progress (Bernardi et al., 2025; Vaintrob and Cotton-Barratt, 2025; Kembery and Ammann, 2025).

Figure 1: The extent of AI R&D automation could affect both AI progress and the oversight gap: the difference between how much oversight is needed (“oversight demand”) and how much oversight is actually achieved. Oversight capacity is the ability to achieve oversight, encompassing both the ability to understand the R&D process (e.g., having sufficient expertise) and the resources available for exercising control (e.g., human labour, monitoring tools). AI progress could also affect the oversight gap, such as by increasing the stakes of R&D decisions. This work proposes metrics to track all of these quantities.

AIRDA could also affect oversight of the AI R&D process. By reducing the need for human researchers, AIRDA could concentrate control over AI development into fewer hands and decrease societal oversight (Davidson et al., 2025).
AI systems could also increase the need for oversight by introducing more frequent errors into the R&D process, potentially leading to loss-of-control incidents (Benton et al., 2024; Ward et al., 2025; Bengio et al., 2026). On the other hand, AIRDA could improve our ability to perform oversight, such as by enabling new AI-assisted oversight tools (Choi et al., 2024). As an analogy, higher-level programming languages let developers write more complex software that still behaves (roughly) as intended

