Mar 16, 2026 | arXiv:2603.14911

Fine-tuning RoBERTa for CVE-to-CWE Classification: A 125M Parameter Model Competitive with LLMs

AI Summary

A RoBERTa-base model was fine-tuned to classify CVE descriptions into CWE categories using a training set of 234,770 CVE descriptions with AI-refined CWE labels; the evaluation sets were restricted to samples where the NVD and AI labels agree. The resulting 125M-parameter model achieves 87.4% top-1 accuracy and 60.7% Macro F1 on a held-out test set. The TF-IDF baseline already reaches 84.9% top-1 accuracy, so the 15.5 percentage-point Macro F1 gain over it comes mainly from rare weakness categories. On the CTI-Bench benchmark, the model's 75.6% strict accuracy is statistically indistinguishable from that of an 8B-parameter model (75.3%), at 64x fewer parameters.
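To illustrate why top-1 accuracy and Macro F1 can diverge when some classes are rare, here is a minimal sketch on toy data. The label counts and predictions below are invented for illustration and are not taken from the paper's dataset; the helper names `top1_accuracy` and `macro_f1` are hypothetical.

```python
def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (invented): one frequent class, one moderate class, one rare class.
y_true = ["CWE-79"] * 90 + ["CWE-89"] * 8 + ["CWE-1321"] * 2
# A classifier that handles the common classes but misses the rare one entirely.
y_pred = ["CWE-79"] * 90 + ["CWE-89"] * 8 + ["CWE-79"] * 2

accuracy = top1_accuracy(y_true, y_pred)
f1_macro = macro_f1(y_true, y_pred)
print(f"top-1 accuracy: {accuracy:.3f}")
print(f"macro F1:       {f1_macro:.3f}")
```

In this toy case only two of 100 samples are wrong, so accuracy stays high, while the rare class scores an F1 of zero and pulls the macro average down by roughly a third. That is the general reason a model can gain a lot of Macro F1 with only a small change in top-1 accuracy.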

Key Contribution

A fine-tuned RoBERTa model matches the CVE-to-CWE classification accuracy of a model 64x larger on CTI-Bench, suggesting that smaller, specialized models can rival LLMs on narrow tasks.

Abstract

We present a fine-tuned RoBERTa-base classifier (125M parameters) for mapping Common Vulnerabilities and Exposures (CVE) descriptions to Common Weakness Enumeration (CWE) categories. We construct a large-scale training dataset of 234,770 CVE descriptions with AI-refined CWE labels using Claude Sonnet 4.6, and agreement-filtered evaluation sets where NVD and AI labels agree. On our held-out test set (27,780 samples, 205 CWE classes), the model achieves 87.4% top-1 accuracy and 60.7% Macro F1 -- a +15.5 percentage-point Macro F1 gain over a TF-IDF baseline that already reaches 84.9% top-1, demonstrating the model's advantage on rare weakness categories. On the external CTI-Bench benchmark (NeurIPS 2024), the model achieves 75.6% strict accuracy (95% CI: 72.8-78.2%) -- statistically indistinguishable from Cisco Foundation-Sec-8B-Reasoning (75.3%, 8B parameters) at 64x fewer parameters. We release the dataset, model, and training code.
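The interval and size comparison quoted above can be sanity-checked with a short calculation. This is a sketch under two assumptions that the abstract does not state: that the CTI-Bench evaluation uses 1,000 items (`N_ITEMS`), and that the interval is a Wilson score interval. The authors may have used a different sample size or method, such as a bootstrap.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


N_ITEMS = 1000          # ASSUMPTION: benchmark size is not given in the abstract
model_acc = 0.756       # RoBERTa-base strict accuracy, from the abstract
reference_acc = 0.753   # Foundation-Sec-8B-Reasoning accuracy, from the abstract

low, high = wilson_interval(model_acc, N_ITEMS)
print(f"95% interval: {low * 100:.1f}% to {high * 100:.1f}%")
print("reference inside interval:", low <= reference_acc <= high)

# Parameter-count ratio between the 8B model and the 125M model.
ratio = 8_000_000_000 / 125_000_000
print(f"size ratio: {ratio:.0f}x")
```

Under these assumptions the interval rounds to 72.8% to 78.2%, which agrees with the reported figures, and the 8B model's 75.3% falls inside it. That agreement is consistent with the abstract but does not confirm which method the authors actually used.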

Data Curation & Synthetic Data | Natural Language Processing | Open-Source Models & Weights

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
