Feb 23, 2026 · arXiv:2602.20156

Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, Maksym Andriushchenko

AI Summary

The paper introduces SkillInject, a benchmark that evaluates how vulnerable LLM agents are to prompt injection attacks delivered through third-party skill files. The authors construct 202 injection-task pairs, ranging from obviously malicious injections to subtle, context-dependent attacks, and evaluate frontier LLMs on them. Attack success rates reach up to 80%, with agents carrying out harmful instructions such as data exfiltration and destructive actions, which indicates that current models remain susceptible to skill-based prompt injection.

Key Contribution

LLM agents, including those built on frontier models, are highly susceptible to prompt injection through malicious third-party skill files: on the SkillInject benchmark, attacks succeed up to 80% of the time, leading agents to execute harmful instructions such as data exfiltration.

Abstract

LLM agents are evolving rapidly, powered by code execution, tools, and the recently introduced agent skills feature. Skills allow users to extend LLM applications with specialized third-party code, knowledge, and instructions. Although this can extend agent capabilities to new domains, it creates an increasingly complex agent supply chain, offering new surfaces for prompt injection attacks. We identify skill-based prompt injection as a significant threat and introduce SkillInject, a benchmark evaluating the susceptibility of widely-used LLM agents to injections through skill files. SkillInject contains 202 injection-task pairs with attacks ranging from obviously malicious injections to subtle, context-dependent attacks hidden in otherwise legitimate instructions. We evaluate frontier LLMs on SkillInject, measuring both security in terms of harmful instruction avoidance and utility in terms of legitimate instruction compliance. Our results show that today's agents are highly vulnerable with up to 80% attack success rate with frontier models, often executing extremely harmful instructions including data exfiltration, destructive action, and ransomware-like behavior. They furthermore suggest that this problem will not be solved through model scaling or simple input filtering, but that robust agent security will require context-aware authorization frameworks. Our benchmark is available at https://www.skill-inject.com/.
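The two quantities the abstract names, security (harmful instruction avoidance) and utility (legitimate instruction compliance), can be illustrated with a minimal sketch. It assumes each injection-task pair has already been judged with two boolean labels; the `PairResult` record, its field names, and the sample data are hypothetical and are not taken from the benchmark, whose actual scoring procedure may differ.

```python
from dataclasses import dataclass


@dataclass
class PairResult:
    """Outcome of one injection-task pair (hypothetical record layout)."""
    pair_id: str
    injection_executed: bool  # agent carried out the injected harmful instruction
    task_completed: bool      # agent followed the legitimate skill instructions


def attack_success_rate(results):
    """Fraction of pairs in which the injected instruction was executed."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(r.injection_executed for r in results) / len(results)


def utility_rate(results):
    """Fraction of pairs in which the legitimate instructions were followed."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(r.task_completed for r in results) / len(results)


# Illustrative labels only; these are not results from the paper.
results = [
    PairResult("pair-1", injection_executed=True, task_completed=True),
    PairResult("pair-2", injection_executed=False, task_completed=True),
    PairResult("pair-3", injection_executed=True, task_completed=False),
    PairResult("pair-4", injection_executed=False, task_completed=True),
]

asr = attack_success_rate(results)
util = utility_rate(results)
print(f"attack success rate: {asr:.2f}, utility: {util:.2f}")
```

Reporting the two rates side by side matters because an agent that refuses every skill instruction would score well on security while being useless, which is why the abstract pairs harm avoidance with legitimate compliance.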

Eval Frameworks & Benchmarks · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 44

Year: 2026

Venue: N/A
