Apr 21, 2026

arXiv:2604.18976

STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming

M. Jung, YongTaek Lim, Chaeyun Kim, Junghwan Kim, Kihyun Kim, Minwoo Kim

AI Summary

This paper introduces STAR-Teaming, a black-box red teaming framework that uses a Multi-Agent System (MAS) with a Strategy-Response Multiplex Network to generate jailbreak prompts. The network structure organizes the search space into semantic communities, improving search efficiency and interpretability of LLM vulnerabilities. Experiments show STAR-Teaming achieves a higher attack success rate than existing methods with lower computational cost, demonstrating the effectiveness of the multiplex network approach.

Key Contribution

Mapping LLM attack strategies onto a multiplex network reveals interpretable vulnerability clusters and makes red teaming more efficient, reaching a higher attack success rate at a lower computational cost.

Abstract

While Large Language Models (LLMs) are widely used, they remain susceptible to jailbreak prompts that can elicit harmful or inappropriate responses. This paper introduces STAR-Teaming, a novel black-box framework for automated red teaming that effectively generates such prompts. STAR-Teaming integrates a Multi-Agent System (MAS) with a Strategy-Response Multiplex Network and employs network-driven optimization to sample effective attack strategies. This network-based approach recasts the intractable high-dimensional embedding space into a tractable structure, yielding two key advantages: it enhances the interpretability of the LLM's strategic vulnerabilities, and it streamlines the search for effective strategies by organizing the search space into semantic communities, thereby preventing redundant exploration. Empirical results demonstrate that STAR-Teaming significantly surpasses existing methods, achieving a higher attack success rate (ASR) at a lower computational cost. Extensive experiments validate the effectiveness and explainability of the Multiplex Network. The code is available at https://github.com/selectstar-ai/STAR-Teaming-paper.
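The idea of organizing strategies into semantic communities and sampling from them without redundant exploration can be illustrated with a small toy sketch. This is not the authors' implementation: it models only a single similarity layer rather than the Strategy-Response Multiplex Network, it uses connected components of a thresholded cosine-similarity graph as a stand-in for community detection, and every name in it (`communities`, `pick_strategy`, `attack_success_rate`, the `threshold` parameter, the placeholder strategy labels) is hypothetical. The only part taken as standard is the attack success rate, computed here as the fraction of attempts judged successful.

```python
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0


def communities(embeddings, threshold=0.8):
    """Group strategies into connected components of a similarity graph.

    Assumption: a simple stand-in for community detection, not the
    paper's multiplex-network procedure.
    """
    names = list(embeddings)
    parent = {n: n for n in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for n in names:
        groups[find(n)].append(n)
    return sorted(sorted(g) for g in groups.values())


def attack_success_rate(outcomes):
    """ASR: fraction of attempts judged successful (0.0 if no attempts)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def pick_strategy(groups, history, tried, rng):
    """Pick a community by smoothed success rate, then an untried strategy.

    Skipping strategies in `tried` is what avoids redundant exploration
    in this toy version. Returns None once everything has been tried.
    """
    candidates = [g for g in groups if any(s not in tried for s in g)]
    if not candidates:
        return None
    weights = []
    for g in candidates:
        outcomes = [o for s in g for o in history.get(s, [])]
        # Laplace smoothing keeps unexplored communities selectable.
        weights.append((sum(outcomes) + 1) / (len(outcomes) + 2))
    group = rng.choices(candidates, weights=weights, k=1)[0]
    return rng.choice([s for s in group if s not in tried])


if __name__ == "__main__":
    # Placeholder embeddings; real ones would come from an embedding model.
    emb = {"strategy_a": [1.0, 0.0], "strategy_b": [0.9, 0.1], "strategy_c": [0.0, 1.0]}
    groups = communities(emb)
    history = {"strategy_a": [True, False]}
    print(groups)
    print(pick_strategy(groups, history, tried={"strategy_a"}, rng=random.Random(0)))
```

Weighting communities by a smoothed success rate is one simple design choice among many; the paper describes its own network-driven optimization, which this sketch does not attempt to reproduce.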

Constitutional AI & AI Ethics | Eval Frameworks & Benchmarks | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 29

Year: 2026

Venue: N/A
