Open-source LLM agents can get a 27% performance boost in tool use by strategically injecting context tailored to address common failure modes.
Injecting demonstrations with a carefully annealed probability can drastically improve exploration in RLVR, even for tasks requiring novel reasoning or domain-specific knowledge.
Despite advances in LLMs, human-AI collaboration still significantly outperforms AI-only agents on domain-specific data science tasks, suggesting that human expertise remains crucial.
LLMs struggle to use tools consistently in dynamic environments, but a simple input reformulation strategy can boost performance by over 16% compared to standard methods like ReAct.