Yuejin Xie

Research focus

Tool Use & Agents (4) · Red-Teaming & Adversarial Robustness (2) · Code Generation & Program Synthesis (2) · Eval Frameworks & Benchmarks (2) · Reasoning & Chain-of-Thought (2)

Frequent co-authors

Dongrui Liu (3) · Zhonghao Yang (3) · Xia Hu (3) · Qihan Ren (3)

Papers (5)

May 28, 2026 · also Shanghai AI Lab

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

AgentDoG 1.5 proves you can achieve GPT-5.4-level agent safety with open-source models trained on just 1k samples, slashing deployment overhead by two orders of magnitude.

Dongrui Liu, Yu Li, Zhonghao Yang +52

Constitutional AI & AI Ethics · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Apr 16, 2026

Benchmarks for Trajectory Safety Evaluation and Diagnosis in OpenClaw and Codex: ATBench-Claw and ATBench-CodeX

Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.

Zhonghao Yang, Yuejin Xie, Haoyu Luo +2

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Tool Use & Agents

Apr 8, 2026

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Reasoning SFT doesn't just memorize; it generalizes, but only if you train it long enough, feed it good data, and use a capable model. Even then, reasoning gains come at the cost of safety.

Qihan Ren, Peng Wang, Ruikun Cai +9

Data Curation & Synthetic Data · Reasoning & Chain-of-Thought · Training Efficiency & Optimization

Apr 2, 2026

ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety

Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.

Haoyu Luo, Yuejin Xie, YuQi Fu +8

Eval Frameworks & Benchmarks · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Mar 3, 2026

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

Code-executing agents can autonomously generate new, solvable math problems that are harder than existing ones, offering a scalable solution to the bottleneck of high-quality training data for advanced LLMs.

Dadi Guo, Yuejin Xie, Qingyu Liu +6

Code Generation & Program Synthesis · Reasoning & Chain-of-Thought · Tool Use & Agents
