This paper extends the "learning with improvements" model, where agents can modify features to obtain better labels, to multiclass settings, bandit feedback scenarios, and agents with budget constraints on improvements. The work provides combinatorial characterizations of online learnability in these extended settings, offering a more comprehensive theoretical understanding of learning with improving agents. The authors derive bounds on the number of mistakes made by online learning algorithms in these scenarios, demonstrating the feasibility of learning with improvements under various practical constraints.
Agents that can strategically tweak their features to get better outcomes change the game for online learning, and this paper delivers the theoretical tools to understand it.
We investigate the recently introduced model of learning with improvements, where agents are allowed to make small changes to their feature values in order to receive a more desirable label. We substantially extend previously published results by providing combinatorial dimensions that characterize online learnability in this model, by analyzing the multiclass setup and learnability in a bandit feedback setup, by modeling agents' cost for making improvements, and more.