This paper surveys text adversarial attack and defense techniques in NLP, focusing on methods developed over the past decade. It categorizes attack methods according to how the perturbations are generated, then reviews defense and detection strategies, highlighting the strengths and weaknesses of each. The survey identifies key challenges and future research directions for improving the robustness of NLP models against adversarial attacks.
The landscape of text adversarial attacks and defenses is now systematically mapped, revealing critical gaps in NLP model robustness.
Text adversarial techniques are a key research area in natural language processing (NLP). As deep-learning-driven NLP has been widely deployed in tasks such as sentiment analysis and machine translation, the lack of robustness of these models against adversarial attacks has become increasingly evident. Adversarial samples in the text domain can mislead models, and the discrete nature of text distinguishes adversarial techniques in this field from those in continuous domains such as images. To outline recent progress, this paper conducts a systematic literature review of text adversarial attacks and defenses from the past decade. The review begins with a classification and analysis of attack methods based on specific criteria, then surveys defense and detection techniques, highlighting their respective strengths and limitations, and finally discusses the open challenges and future development directions in both attack and defense, providing a valuable reference for researchers in this area.
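To make the idea of a text adversarial sample concrete, the sketch below shows a greedy synonym-substitution attack, one of the common attack styles such surveys categorize. The "classifier", synonym table, and example sentence are hypothetical toy stand-ins (a real attack would query an actual NLP model and use embedding- or thesaurus-based synonym search); this is a minimal illustration, not the method of any specific paper.

```python
# Toy greedy word-substitution attack (illustrative sketch only).
# SYNONYMS, the word lists, and toy_sentiment are hypothetical stand-ins
# for a real synonym source and a real sentiment classifier.

SYNONYMS = {
    "great": ["fine", "decent"],
    "love": ["like", "enjoy"],
    "terrible": ["poor", "weak"],
}

POSITIVE = {"great", "love", "enjoy"}
NEGATIVE = {"terrible", "poor", "weak"}


def toy_sentiment(tokens):
    """Toy 'model': positive-word count minus negative-word count."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def greedy_attack(tokens):
    """Greedily swap each word for a synonym that lowers the sentiment score,
    producing a semantically similar but differently classified sentence."""
    tokens = list(tokens)
    for i, tok in enumerate(tokens):
        for sub in SYNONYMS.get(tok, []):
            candidate = tokens[:i] + [sub] + tokens[i + 1:]
            if toy_sentiment(candidate) < toy_sentiment(tokens):
                tokens = candidate
                break  # keep the first score-reducing substitution
    return tokens


original = ["i", "love", "this", "great", "movie"]
adversarial = greedy_attack(original)
# The adversarial sentence stays fluent while the toy score drops.
```

Because text is discrete, the attack must search over word-level substitutions rather than adding small continuous noise, which is exactly the distinction from image-domain attacks noted above.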