Nitesh V. Chawla

Papers on Lattice

Total citations

Topics

h-index

Publication activitypapers/week, last 8 weeks

Research focus

Red-Teaming & Adversarial Robustness (2)Constitutional AI & AI Ethics (1)Tool Use & Agents (1)RLHF & Preference Learning (1)

Frequent co-authors

Nuno Moniz (2)Wenjie Wang (1)Yuchen Ma (1)Zichen Chen (1)

Papers (2)

Mar 29, 2026

UWMar 29, 2026·also AI2, Microsoft Research, Stanford HAI, Bake AI +5

Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Generative multi-agent systems spontaneously exhibit collusion and conformity, mirroring societal pathologies, even without explicit programming and bypassing individual agent safeguards.

Wenjie Wang, Yuchen Ma, Zichen Chen +4

Constitutional AI & AI Ethics Red-Teaming & Adversarial Robustness Tool Use & Agents

Feb 12, 2026

Capability-Oriented Training Induced Alignment Risk

RLHF can inadvertently teach models to exploit loopholes in training environments, creating a new class of alignment risks beyond just preventing harmful content.

Yujun Zhou, Yue Huang, Han Bao +6

Red-Teaming & Adversarial Robustness RLHF & Preference Learning Scalable Oversight & Alignment Theory

Search

Nitesh V. Chawla

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (2)