Feb 19, 2026 · arXiv:2602.17103

Online Learning with Improving Agents: Multiclass, Budgeted Agents and Bandit Learners

AI Summary

This paper extends the "learning with improvements" model, where agents can modify features to obtain better labels, to multiclass settings, bandit feedback scenarios, and agents with budget constraints on improvements. The work provides combinatorial characterizations of online learnability in these extended settings, offering a more comprehensive theoretical understanding of learning with improving agents. The authors derive bounds on the number of mistakes made by online learning algorithms in these scenarios, demonstrating the feasibility of learning with improvements under various practical constraints.

Key Contribution

Agents that can modify their features to obtain better outcomes change the online learning problem, and this paper provides the theoretical tools (combinatorial characterizations and mistake bounds) to analyze it.

Abstract

We investigate the recently introduced model of learning with improvements, where agents are allowed to make small changes to their feature values in order to obtain a more desirable label. We substantially extend previously published results by providing combinatorial dimensions that characterize online learnability in this model, by analyzing the multiclass setup, learnability in a bandit feedback setup, modeling agents' cost for making improvements, and more.
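To make the interaction described in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm or formal definition. Everything in it is an illustrative assumption: a one-dimensional feature, binary labels, threshold classifiers, a scalar movement `budget`, and the hypothetical names `ThresholdClassifier`, `agent_response`, and `is_mistake`. The one idea it tries to capture is that an improvement is genuine, so the learner is judged at the point the agent ends up at.

```python
from dataclasses import dataclass


@dataclass
class ThresholdClassifier:
    """Labels a 1-D feature as 1 (desirable) when it is at or above the threshold."""
    threshold: float

    def predict(self, x: float) -> int:
        return 1 if x >= self.threshold else 0


def agent_response(x: float, learner: ThresholdClassifier, budget: float) -> float:
    """Return the agent's feature value after an optional improvement.

    Assumption (toy model): the agent moves only if it is currently predicted 0
    and the learner's threshold is reachable within `budget`; in that case it
    moves exactly to the threshold. Otherwise it stays where it is.
    """
    if learner.predict(x) == 1:
        return x
    if learner.threshold - x <= budget:
        return learner.threshold
    return x


def is_mistake(x: float, learner: ThresholdClassifier,
               truth: ThresholdClassifier, budget: float) -> bool:
    """Compare the learner's label with the true label at the agent's final point.

    Assumption: the feature change is a real improvement, so the true label is
    evaluated at the new feature value, not the original one.
    """
    x_new = agent_response(x, learner, budget)
    return learner.predict(x_new) != truth.predict(x_new)


truth = ThresholdClassifier(threshold=5.0)

# A learner whose threshold is too low lets an agent stop short of truly qualifying.
lenient = ThresholdClassifier(threshold=4.0)
print(is_mistake(3.5, lenient, truth, budget=1.0))

# A learner whose threshold is too high is not penalized when the agent can reach it.
strict = ThresholdClassifier(threshold=6.0)
print(is_mistake(5.2, strict, truth, budget=1.0))
```

In this toy setup a lenient threshold causes a mistake, while a strict one does not as long as the agent's budget covers the gap. The multiclass and bandit-feedback settings mentioned in the abstract are not modeled here.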

Natural Language Processing · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
