Feb 26, 2026

arXiv:2602.23242

A Model-Free Universal AI

Yegon Kim, Juho Lee

AI Summary

This paper introduces Universal AI with Q-Induction (AIQI), the first model-free reinforcement learning agent proven to be asymptotically ε-optimal in general RL. AIQI achieves this by performing universal induction directly over distributional action-value functions, circumventing the need for explicit environment models used by previous universal agents like AIXI. The authors prove that AIQI is strong asymptotically ε-optimal and asymptotically ε-Bayes-optimal under a grain of truth condition, demonstrating a novel approach to universal AI.

Key Contribution

Model-free reinforcement learning can achieve asymptotic optimality: AIQI learns without environment models by directly inducing action-value functions.

Abstract

In general reinforcement learning, all established optimal agents, including AIXI, are model-based, explicitly maintaining and using environment models. This paper introduces Universal AI with Q-Induction (AIQI), the first model-free agent proven to be asymptotically $\varepsilon$-optimal in general RL. AIQI performs universal induction over distributional action-value functions, rather than over policies or environments as in previous work. Under a grain of truth condition, we prove that AIQI is strong asymptotically $\varepsilon$-optimal and asymptotically $\varepsilon$-Bayes-optimal. Our results significantly expand the diversity of known universal agents.
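As a rough illustration of what "induction over distributional action-value functions" could look like in miniature, the sketch below keeps a Bayesian posterior over a small finite set of candidate return distributions and acts greedily on the posterior-weighted mean return. This is not the paper's AIQI construction: the finite hypothesis class, the function names (`posterior_update`, `mixture_q`, `greedy_action`), and the two-action example are all assumptions made for illustration. No environment model appears anywhere; each hypothesis only predicts returns per action.

```python
# Toy sketch (assumption, not the paper's algorithm): Bayesian induction over
# a finite class of distributional action-value hypotheses.
# A hypothesis maps each action to a dict {return_value: probability}.

def posterior_update(weights, hypotheses, action, observed_return):
    """Reweight hypotheses by how well they predicted the observed return."""
    likelihoods = [h[action].get(observed_return, 0.0) for h in hypotheses]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # No hypothesis assigns positive probability; keep the old weights.
        return list(weights)
    return [u / total for u in unnormalized]

def mixture_q(weights, hypotheses, action):
    """Posterior-weighted expected return of taking `action`."""
    return sum(
        w * sum(ret * prob for ret, prob in h[action].items())
        for w, h in zip(weights, hypotheses)
    )

def greedy_action(weights, hypotheses, actions):
    """Pick the action with the highest mixture expected return."""
    return max(actions, key=lambda a: mixture_q(weights, hypotheses, a))

# Hypothetical two-action example with two competing hypotheses.
ACTIONS = ["left", "right"]
HYPOTHESES = [
    {"left": {1: 0.9, 0: 0.1}, "right": {1: 0.1, 0: 0.9}},  # "left is good"
    {"left": {1: 0.1, 0: 0.9}, "right": {1: 0.9, 0: 0.1}},  # "right is good"
]
PRIOR = [0.5, 0.5]

# Observing return 1 after "left" shifts weight toward the first hypothesis.
posterior = posterior_update(PRIOR, HYPOTHESES, "left", 1)
choice = greedy_action(posterior, HYPOTHESES, ACTIONS)
```

The posterior here is updated from (action, return) pairs alone, which is the sense in which the sketch is model-free; how the paper defines its hypothesis class and exploration scheme is not stated in this listing.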

Scalable Oversight & Alignment Theory

World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 54

Year: 2026

Venue: N/A
