Fudan, Huawei | Apr 12, 2026 | arXiv:2604.10767

VulWeaver: Weaving Broken Semantics for Grounded Vulnerability Detection

Yiheng Cao, Yihao Chen, Xin Hu, Bihuan Chen, Jiayi Deng, Zhuotong Zhou, Susheng Wu, Yiheng Huang, Xueying Du, Xingman Chen, Miaohua Li, Xin Peng

AI Summary

VulWeaver, a novel LLM-based approach, enhances vulnerability detection by weaving broken program semantics into accurate representations and extracting holistic vulnerability context. It constructs an enhanced unified dependency graph (UDG) by integrating deterministic rules with LLM-based semantic inference, and combines explicit program slicing contexts with implicit usage, definition, and declaration information. Experiments on PrimeVul4J and real-world Java projects show VulWeaver significantly outperforms existing methods, achieving a 0.75 F1-score and detecting confirmed vulnerabilities with CVE assignments.

Key Contribution

LLMs can find more real-world code vulnerabilities if you give them a better program representation that combines static analysis with semantic inference, and guide their reasoning with vulnerability-specific meta-prompts.

Abstract

Detecting vulnerabilities in source code remains critical yet challenging, as conventional static analysis tools construct inaccurate program representations, while existing LLM-based approaches often miss essential vulnerability context and lack grounded reasoning. To mitigate these challenges, we introduce VulWeaver, a novel LLM-based approach that weaves broken program semantics into accurate representations and extracts holistic vulnerability context for grounded vulnerability detection. Specifically, VulWeaver first constructs an enhanced unified dependency graph (UDG) by integrating deterministic rules with LLM-based semantic inference to address static analysis inaccuracies. It then extracts holistic vulnerability context by combining explicit contexts from program slicing with implicit contexts, including usage, definition, and declaration information. Finally, VulWeaver employs meta-prompting with vulnerability-type-specific expert guidelines to steer LLMs through systematic reasoning, with the results aggregated via majority voting for robustness. Extensive experiments on the PrimeVul4J dataset demonstrate that VulWeaver achieves an F1-score of 0.75, outperforming state-of-the-art learning-based, LLM-based, and agent-based baselines by 23%, 15%, and 60% in F1-score, respectively. VulWeaver has also detected 26 true vulnerabilities across 9 real-world Java projects, with 15 confirmed by developers and 5 CVE identifiers assigned. In industrial deployment, VulWeaver identified 40 confirmed vulnerabilities in an internal repository.
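
Two of the steps named in the abstract, program slicing over a dependency graph and majority voting over repeated LLM verdicts, can be illustrated with a minimal Python sketch. This is not VulWeaver's implementation: the abstract does not specify the slicing algorithm, the graph encoding, or how ties are broken, so the adjacency-map format, the names `backward_slice` and `majority_vote`, and the strict-majority tie rule below are all assumptions made for illustration.

```python
from collections import deque


def backward_slice(deps, criterion):
    """Return every node that `criterion` transitively depends on.

    `deps` maps a node to the nodes it directly depends on (data or
    control dependencies). This encoding is hypothetical, not the UDG
    format used by the paper.
    """
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        node = queue.popleft()
        for pred in deps.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return seen


def majority_vote(verdicts):
    """Aggregate boolean verdicts (True = vulnerable) by strict majority.

    Assumption: a tie counts as "not vulnerable"; the abstract does not
    say how ties are resolved.
    """
    return 2 * sum(verdicts) > len(verdicts)


# Toy dependency graph: the sink depends on a path built from user input.
deps = {
    "sink": ["path"],
    "path": ["base", "user_input"],
    "log": ["base"],
}

context = backward_slice(deps, "sink")  # excludes the unrelated "log" node
decision = majority_vote([True, True, False])
```

In this toy graph the slice for `sink` keeps `path`, `base`, and `user_input` and drops `log`, which is the kind of explicit context the abstract describes; the implicit contexts (usage, definition, declaration) are not modeled here.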

Code Generation & Program Synthesis, Eval Frameworks & Benchmarks, Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
