The paper introduces MultiVul, a multimodal contrastive learning framework that aligns code and comment representations to improve software vulnerability detection. It combines dual similarity learning with consistency regularization and is trained on diverse code-text pairs to improve robustness and generalization. Experiments with four LLMs on the DiverseVul and Devign datasets show that MultiVul outperforms both baselines, improving F1 by up to 27.07% over prompting-based methods and 13.37% over code-only fine-tuning.
Aligning code with developer comments improves software vulnerability detection: F1 rises by up to 27% over prompting-based methods and by up to 13% over code-only fine-tuning.
Source code and its accompanying comments are complementary yet naturally aligned modalities: code encodes structural logic, while comments capture developer intent. However, existing vulnerability detection methods mostly rely on single-modality code representations, overlooking the complementary semantic information embedded in comments and thus limiting their generalization across complex code structures and logical relationships. To address this, we propose MultiVul, a multimodal contrastive framework that aligns code and comment representations through dual similarity learning and consistency regularization, augmented with diverse code-text pairs to improve robustness. Experiments on the widely adopted DiverseVul and Devign datasets across four large language models (LLMs), namely DeepSeek-Coder-6.7B, Qwen2.5-Coder-7B, StarCoder2-7B, and CodeLlama-7B, show that MultiVul achieves up to 27.07% F1 improvement over prompting-based methods and 13.37% over code-only fine-tuning, while maintaining comparable inference efficiency.
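To make the idea of aligning code and comment representations concrete, here is a minimal sketch of a generic symmetric contrastive (InfoNCE-style) loss over paired embeddings. It is not MultiVul's actual objective: the abstract does not specify the form of its dual similarity learning or consistency regularization, so the function names, the cosine similarity, and the temperature value of 0.07 are all illustrative assumptions. The sketch also assumes every embedding is a non-zero vector.

```python
import math

# Illustrative sketch only: a generic symmetric contrastive loss, NOT the
# paper's exact objective. All names and the temperature are assumptions.

def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_similarity_matrix(code_embs, comment_embs):
    """Entry [i][j] is the cosine similarity of code i and comment j."""
    code_n = [l2_normalize(v) for v in code_embs]
    comment_n = [l2_normalize(v) for v in comment_embs]
    return [[sum(a * b for a, b in zip(c, m)) for m in comment_n]
            for c in code_n]

def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(code_embs, comment_embs, temperature=0.07):
    """Average of the code-to-comment and comment-to-code losses."""
    sim = cosine_similarity_matrix(code_embs, comment_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # comment-to-code view
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(logits_t))

if __name__ == "__main__":
    code = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[0.9, 0.1], [0.1, 0.9]]   # comment i describes code i
    shuffled = [[0.1, 0.9], [0.9, 0.1]]  # comments paired with the wrong code
    print(symmetric_contrastive_loss(code, matched))
    print(symmetric_contrastive_loss(code, shuffled))
```

In this toy example the matched pairing yields a much lower loss than the shuffled one, which is the signal a contrastive objective uses to pull each code snippet toward its own comment and away from the others in the batch.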