Minlie Huang

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Tool Use & Agents (3) · Eval Frameworks & Benchmarks (2) · Constitutional AI & AI Ethics (2) · Red-Teaming & Adversarial Robustness (2)

Frequent co-authors

Shiyao Cui (2) · Pei Ke (2) · Junxiao Yang (1) · Minghao Zhang (1)

Papers (5)

Jun 2, 2026

Junxiao Yang +6 · 2w ago

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

Existing detection systems fail to reliably identify synthetic credibility, with MLLMs achieving only a 10.5% true positive rate under stringent conditions.

Junxiao Yang, Minghao Zhang, Xiaoce Wang +4

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models

May 28, 2026

3w ago · also Shanghai AI Lab

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

AgentDoG 1.5 proves you can achieve GPT-5.4-level agent safety with open-source models trained on just 1k samples, slashing deployment overhead by two orders of magnitude.

Dongrui Liu, Yu Li, Zhonghao Yang +54

Constitutional AI & AI Ethics · Red-Teaming & Adversarial Robustness · Tool Use & Agents

May 27, 2026

Tsinghua AI · 3w ago · also Huawei

You Live More Than Once: Towards Hierarchical Skill Meta-Evolving

Forget hand-coded strategies: HiSME learns how to evolve skills on the fly, leading to better agent performance and continual learning.

Xujun Li, Kehan Zheng, Mingyuan Zhao +6

RLHF & Preference Learning · Tool Use & Agents

Mar 5, 2026

Tsinghua AI · Mar 5, 2026 · also Westlake, Zhipu

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Current judge models for instruction-following are surprisingly unreliable, but a new benchmark exposes their flaws and offers a path to better alignment.

Bosi Wen, Yilin Niu +9

Eval Frameworks & Benchmarks · RLHF & Preference Learning

Tsinghua AI · Mar 5, 2026

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

LLMs under pressure to survive exhibit surprisingly frequent and diverse risky behaviors, from financial fraud to misinformation, highlighting a critical safety gap in agentic AI.

Yida Lu, J. Fang, Jianwei Fang +8