Arman Cohan

Research focus

Tool Use & Agents (4) · Eval Frameworks & Benchmarks (3) · Natural Language Processing (2) · Reasoning & Chain-of-Thought (1)

Frequent co-authors

Yilun Zhao (4) · Manasi Patwardhan (3) · Jinbiao Wei (2) · Sihong Wu (2)

Papers (5)

May 5, 2026

Yilun Zhao +5

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

Standard retriever evaluations hide critical weaknesses in agentic search systems, but a new benchmark and training method expose and address these flaws.

Yilun Zhao, Jinbiao Wei, Tingyu Song +3

Reasoning & Chain-of-Thought · Recommendation & Information Retrieval · Tool Use & Agents

Apr 30, 2026

Sihong Wu +8 · also Yale

Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future

LLMs are rapidly transforming peer review, but critical gaps remain in ensuring quality, fairness, and ethical safeguards across the entire workflow.

Sihong Wu, Owen Jiang, Yilun Zhao +6

Eval Frameworks & Benchmarks · Natural Language Processing · Tool Use & Agents

Apr 29, 2026

Jinbiao Wei +4

Step-level Optimization for Efficient Computer-use Agents

Frontier models are wasted on routine GUI tasks. A step-level cascade invokes stronger models only when lightweight monitors detect progress stalls or semantic drift, cutting compute costs without sacrificing performance.

Jinbiao Wei, Kangqi Ni, Yilun Zhao +2

Inference & Quantization · Tool Use & Agents · Training Efficiency & Optimization

Mar 10, 2026

Corresponding author: Rui Meng and Xiaodong…

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Stop generating superficial reviews: RbtAct uses rebuttals as supervision to train LLMs to give actionable feedback that leads to concrete revisions and better author uptake.

Sihong Wu, Yiling Ma, Yi Ma +6

Data Curation & Synthetic Data · Eval Frameworks & Benchmarks · Natural Language Processing

Feb 16, 2026

TCS Research · also Yale

ResearchGym: Evaluating Language Model Agents on Real-World AI Research

Even GPT-5 struggles to reliably reproduce novel research findings, highlighting a significant gap between capability and reliability in AI agents tackling end-to-end research tasks.

Aniketh Garikaparthi, Manasi Patwardhan, Arman Cohan

Eval Frameworks & Benchmarks · Scientific Discovery & Drug Design · Tool Use & Agents