LLM-based narrative evaluation reveals that the *way* people tell their stories is a stronger predictor of mental health than the specific words they use.
Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.
Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.
GenRec demonstrates that generative recommendation can beat existing pipelines in a large-scale industrial setting, achieving nearly 10% gains in key metrics by focusing on preference alignment and efficient sequence encoding.
Seedance 2.0 leapfrogs existing models by unifying multi-modal inputs (text, image, audio, video) into a single architecture for generating high-quality, longer-duration audio-video content.
LLMs can generate clinical summaries that not only improve the accuracy of multimodal depression detection but also provide transparent rationales for those predictions.
LLM datasets aren't independent islands: tracing their lineage reveals hidden redundancy, benchmark contamination, and opportunities for more diverse training data.
By distilling successful and failed reasoning paths into a "Cognitive Tree," T-STAR pinpoints and corrects critical errors in multi-turn reasoning, leading to significant performance gains.
Forget tedious hyperparameter sweeps; AutoSOTA automates the *entire* research pipeline, discovering 105 new SOTA models across diverse AI tasks in just five hours per paper.
Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.
Row/column normalization *before* orthogonalization can significantly boost convergence and reduce validation perplexity in LLaMA2 pretraining, outperforming the base Muon optimizer.
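The idea behind this result can be sketched in a few lines. Muon orthogonalizes each 2D weight update with a quintic Newton-Schulz iteration; the claim above is that normalizing the rows (or columns) of the update matrix *first* improves convergence. The snippet below is a minimal NumPy sketch under assumptions: the standard published Muon Newton-Schulz coefficients, row-wise L2 normalization as the preprocessing step, and a hypothetical `grad` matrix. The paper's exact normalization scheme may differ.

```python
import numpy as np

def row_normalize(G, eps=1e-8):
    # Assumed preprocessing: scale each row to unit L2 norm
    # before the orthogonalization step.
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G / (norms + eps)

def newton_schulz_orthogonalize(G, steps=5):
    # Quintic Newton-Schulz iteration (Muon's published coefficients),
    # driving all singular values of G toward 1.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-8)  # scale so spectral norm <= 1
    transposed = X.shape[0] > X.shape[1]
    if transposed:                       # iterate on the "wide" orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# Hypothetical update for one weight matrix: normalize rows, then orthogonalize.
rng = np.random.default_rng(0)
grad = rng.standard_normal((64, 128))
update = newton_schulz_orthogonalize(row_normalize(grad))
```

In this sketch the normalization only changes the matrix fed into the iteration; the Newton-Schulz step itself is unmodified Muon, which is what makes the reported gain a drop-in preprocessing tweak.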
LLMs can now write better quantitative trading algorithms than humans, thanks to a new framework that turns unstructured financial reports into executable code.
ARISE lets language models solve math problems better by learning and reusing successful solution strategies, outperforming existing RL methods, especially on harder, out-of-distribution problems.
LLMs can now scale depth more effectively: a new attention mechanism recovers diluted features in deeper layers, boosting performance with negligible overhead.
Self-wrapping cables aren't just a nuisance in robotic manipulation; they're a feature that can be exploited for redirected torque and more efficient object control.