Vahab Mirrokni

Recurrent models can now achieve Transformer-competitive performance on recall-intensive tasks, thanks to a simple memory caching mechanism that grows memory capacity with sequence length.

Ali Behrouz, Zeman Li, Yuan Deng +3

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Training Efficiency & Optimization

Feb 24, 2026

DeepMind · Feb 24, 2026 · also CMU ML, Google Research, USC

Aletheia tackles FirstProof autonomously

Gemini 3 Deep Think can now autonomously solve a majority of problems in a challenging math competition, signaling a leap in AI's mathematical reasoning capabilities.

Tony Feng, Junehyuk Jung +26

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Tool Use & Agents

Feb 23, 2026

Google Research · Feb 23, 2026 · also EPFL

Less is More: Convergence Benefits of Fewer Data Weight Updates over Longer Horizon

Surprisingly, using only a single inner loop update in data mixing can lead to failure, and the optimal number of inner loop steps scales logarithmically with the parameter update budget.

Rudrajit Das, Neel Patel, Meisam Razaviyayn +1

Training Efficiency & Optimization

Papers (5)