Lattice: the structure behind the noise

Created by Flynn Lachendro

Radha Poovendran | Lattice

Radha Poovendran

Papers on Lattice: 4

Total citations: 6

Topics: 8

h-index: 13

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (4)

Tool Use & Agents (1)

Scalable Oversight & Alignment Theory (1)

Scientific Discovery & Drug Design (1)

Frequent co-authors

Yuetai Li (2)

Website GitHub (1)

HuggingFace Leaderboard (1)

Yiyou Sun (1)

Papers (4)

Jun 3, 2026

Tsinghua AI · 1w ago · also Department of Computer Science, Georgia Tech, KU Leuven, PKU +6

Agents' Last Exam

The hardest AI tasks remain largely unsolved, with current models achieving only a 2.6% success rate on economically valuable workflows.

Website GitHub, HuggingFace Leaderboard, Yiyou Sun +306

Eval Frameworks & Benchmarks · Tool Use & Agents

UW · 1w ago · also MIT CSAIL, Stanford HAI, Notre Dame, Princeton +1

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

Success in long-horizon tasks hinges more on an agent's iterative persistence than on the quality of its initial solution.

Zhangchen Xu, Junda Chen, Yue Huang +15

Eval Frameworks & Benchmarks · Scalable Oversight & Alignment Theory · Scientific Discovery & Drug Design

May 26, 2026

UW · 2w ago · also AI2, Microsoft Research, NUS, Cambricon Technologies +8

The Strongest Teacher Is Not Always the Best Teacher: Student-Centric Answer Selection

The best LLM to answer a question isn't always the best LLM to *teach* the answer, and matching the "difficulty" of the explanation to the student's current abilities yields better learning.

Zhengyu Hu, Zheyuan Xiao, Linxin Song +11

Data Curation & Synthetic Data · Eval Frameworks & Benchmarks · RLHF & Preference Learning

May 27, 2025

UW · May 27, 2025 · also SambaNova, University of Georgia

SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge

Despite claims of safety alignment, state-of-the-art LLMs still spill the beans on hazardous scientific knowledge at an alarming rate, failing nearly 80% of the time on a new regulation-grounded benchmark.

Fengqing Jiang, Fengbo Ma, Zhangchen Xu +76

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Red-Teaming & Adversarial Robustness

Data Curation & Synthetic Data (1)

RLHF & Preference Learning (1)

Xinyan Han (1)

Weichen Zhang (1)

Tianyu Wang (1)