This research systematically investigates context-based adversarial attacks on AI code generators, showing how strategically crafted contextual inputs can introduce security vulnerabilities into generated code. Across 2,800 experiments on four models (CodeT5+, CodeLlama, GPT-3.5-Turbo, and GPT-4), the study quantifies the effectiveness of these attacks, reporting a 10.7x increase in vulnerability generation and a 100% success rate for direct instruction attacks on GPT-3.5-Turbo. The findings point to systemic architectural vulnerabilities across models rather than model-specific flaws. A proposed dual-layer defense framework achieves an 89.1% detection rate with 0.3% false positives and 520ms latency, suggesting it is practical for real-time deployment in development environments.
Context-based adversarial attacks can amplify code generation vulnerabilities by over 10 times, exposing critical flaws in popular AI models.
AI-powered code generation systems have transformed software development but introduce critical inference-time security vulnerabilities. This research presents a systematic investigation of context-based adversarial attacks, in which strategically crafted contextual inputs, including comments, documentation, and variable names, bias large language models toward generating exploitable code. Through 2,800 controlled experiments across CodeT5+, CodeLlama, GPT-3.5-Turbo, and GPT-4, we quantify attack effectiveness and evaluate defense mechanisms. Results demonstrate that adversarial conditions increase vulnerability generation 10.7x (from 3.5% to 37.4%), with direct instruction attacks achieving 100% success on GPT-3.5-Turbo. Cross-model transferability reaches 60-100%, indicating systemic architectural vulnerabilities rather than model-specific flaws. Our dual-layer defense framework achieves an 89.1% detection rate with 0.3% false positives and 520ms latency, demonstrating practical feasibility for real-time deployment in development environments.
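The 10.7x figure is a relative increase and follows directly from the two rates quoted in the abstract. The short Python sketch below reproduces that arithmetic and also shows the absolute change in percentage points. The function name `amplification_factor` and the variable names are illustrative assumptions, not part of the paper.

```python
def amplification_factor(baseline_rate: float, adversarial_rate: float) -> float:
    """Return the ratio of the adversarial rate to the baseline rate.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return adversarial_rate / baseline_rate


# Vulnerability rates reported in the abstract, expressed as fractions.
baseline = 0.035     # 3.5% of generations vulnerable without adversarial context
adversarial = 0.374  # 37.4% of generations vulnerable under adversarial context

factor = amplification_factor(baseline, adversarial)
absolute_increase_points = (adversarial - baseline) * 100

print(f"relative increase: {factor:.1f}x")
print(f"absolute increase: {absolute_increase_points:.1f} percentage points")
```

Reading the result both ways is useful: the relative factor conveys how strongly adversarial context amplifies the baseline risk, while the percentage-point difference shows how many additional generations out of every hundred would contain a vulnerability.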