May 11 – May 18, 2026

Reasoning & Chain-of-Thought - Weekly Roundup

2 papers published across 1 lab.

397% acceleration

Selected Labs publishing this week

Amazon Science (1)

Top Papers

May 18, 2026

Also from Amazon Science, MiroMind, UT Austin

Lean Refactor: Multi-Objective Controllable Proof Optimization via Agentic Strategy Search

LLMs can now automatically slim down and future-proof mathematical proofs, achieving 70% compression and 60% faster compilation by strategically rewriting them.

Jialin Lu, Soonho Kong, Rodrigo Stehling +4

Code Generation & Program Synthesis · Reasoning & Chain-of-Thought · Tool Use & Agents

May 11, 2026

Unsupervised Process Reward Models

Forget expensive human annotations: this unsupervised method trains reward models that steer LLM reasoning just as well as, or even better than, their supervised counterparts.

Artyom Gadetsky, M. Kodryan, Siba Smarak Panigrahi +2

Reasoning & Chain-of-Thought · RLHF & Preference Learning