By explicitly learning when *not* to act, agentic LLMs can be taught to refuse harmful actions with up to 50% greater success, a gain that transfers zero-shot across diverse models and tasks.