Ahmed Awadallah

Research focus

Tool Use & Agents (2) · RLHF & Preference Learning (1) · Training Efficiency & Optimization (1) · Constitutional AI & AI Ethics (1)

Frequent co-authors

Karan Gupta (1) · Pranav Vajreshwari (1) · Yash Pandya (1) · Raghav Magazine (1)

Papers (2)

Microsoft Research · Mar 5, 2026

Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces

A 4B-parameter SLM can now rival frontier agent performance in complex tool-use environments, thanks to a novel reinforcement finetuning framework that teaches it to strategically acquire context and execute actions.

Karan Gupta, Pranav Vajreshwari, Yash Pandya +3

RLHF & Preference Learning · Tool Use & Agents · Training Efficiency & Optimization

Aradhye Agarwal +6 · Mar 3, 2026 · also Microsoft Research

Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Agentic LLMs can be taught to refuse harmful actions with up to 50% greater success, even zero-shot across diverse models and tasks, by explicitly learning when *not* to act.

Aradhye Agarwal, Gurdit Siyan, Yash Pandya +4

Constitutional AI & AI Ethics · Red-Teaming & Adversarial Robustness · Tool Use & Agents
