Weihang Su

Research focus

Eval Frameworks & Benchmarks (3), Natural Language Processing (3), Data Curation & Synthetic Data (2), Recommendation & Information Retrieval (1), Tool Use & Agents (1)

Frequent co-authors

Qingyao Ai (4), Jianming Long (3), Yiqun Liu (3), Yichen Tang (2)

Papers (4)

Tsinghua AI · Apr 27, 2026

Skill Retrieval Augmentation for Agentic AI

Explicitly enumerating skills in-context doesn't scale for agentic LLMs, but retrieving skills on demand can substantially improve performance, provided the LLM can figure out when to load a skill and which one to load.

Weihang Su, Jianming Long, Qingyao Ai +4

Recommendation & Information Retrieval, Tool Use & Agents

Tsinghua AI · Oct 29, 2025 · also Fudan, Rutgers

TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation

LLMs still can't convincingly mimic human personas, especially when it comes to syntactic style and memory, despite advancements in other areas.

Bangde Du, Minghao Guo, Songming He +8

Eval Frameworks & Benchmarks, Natural Language Processing, Speech & Audio

Tsinghua AI · Oct 20, 2025

MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems

LLMs still struggle to learn effectively from user feedback during service, as revealed by a new benchmark spanning multiple domains and languages.

Qingyao Ai, Yichen Tang, Changyue Wang +311

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing

Tsinghua AI · Aug 21, 2025 · also Fudan

SurGE: A Benchmark and Evaluation Framework for Scientific Survey Generation

LLMs still struggle to synthesize coherent scientific surveys, as evidenced by a new benchmark revealing significant performance gaps even with advanced agentic frameworks.

Weihang Su, Anzhe Xie, Qingyao Ai +5

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing
