Vaneet Aggarwal

Learning policies from just one trajectory in average-reward MDPs is now feasible, with guarantees that could transform how we approach sample efficiency in reinforcement learning.

Jongmin Lee, Ernest K. Ryu, Vaneet Aggarwal

World Models & Planning

Apr 1, 2026

Mudita Sharma +7 · Apr 1, 2026 · also Purdue

Lipschitz Dueling Bandits over Continuous Action Spaces

Achieving near-optimal regret in continuous dueling bandits is now possible with just logarithmic space complexity, opening the door to efficient exploration in complex comparative decision-making problems.

Mudita Sharma, Mudit Sharma, Shweta Jain +5

Recommendation & Information Retrieval · Robotics & Embodied AI



Papers (3)