Hao Wang

University of Science and Technology of China, Stevens Institute of Technology

Frequent co-authors

Yanting Wang (1), Hao Li (1), Rui Li (1), Lei Sha (1)

Papers (1)

Jan 15, 2026 · also Queen's, RUC, Shanghai AI Lab, Stevens

Be Your Own Red Teamer: Safety Alignment via Self-Play and Reflective Experience Replay

LLMs can be made significantly more robust to jailbreaking by having them red-team themselves via self-play, dynamically evolving attack strategies to uncover vulnerabilities.

Hao Wang, Yanting Wang, Hao Li +2
