BUPT | Apr 30, 2026 | arXiv:2604.27601

SecGoal: A Benchmark for Security Goal Extraction and Formalization from Protocol Documents

Dawei Huang, Hui Li, Haonan Feng, Jingjing Guan, Yueshuang Jiao, Bo Jia

AI Summary

The paper introduces SecGoal, an expert-annotated benchmark for security goal extraction and formalization covering 15 widely deployed protocol documents. The authors also present AIFG, an AI-assisted framework that decomposes the task into context-aware goal extraction and retrieval-augmented formalization. Experiments show that frontier large language models (LLMs) reach high recall but low precision when identifying security goals, often misclassifying operational text as goals, whereas instruction tuning on SecGoal enables compact 7B/9B models to achieve F1-scores above 80%.

Key Contribution

Frontier LLMs identify security goals in protocol documents with high recall but precision below 15%, while instruction tuning on the new SecGoal benchmark lets compact 7B/9B models reach F1-scores above 80%.

Abstract

Formal verification provides rigorous guarantees for cryptographic security, yet automating the extraction and formalization of security goals from natural language protocol documents remains a major bottleneck, compounded by the scarcity of expert-annotated resources and integrated frameworks bridging unstructured text and symbolic logic. We introduce SecGoal, the first expert-annotated benchmark covering 15 widely deployed protocol documents, including 5G-AKA and TLS 1.3, and AIFG, an AI-assisted framework that decomposes the task into context-aware goal extraction and retrieval-augmented formalization. We conduct a comprehensive evaluation to assess whether contemporary LLMs are ready to automate this pipeline. Our results reveal a pronounced precision-recall imbalance: frontier models, such as Gemini 2.5-Pro, achieve high recall but precision below 15%, frequently misclassifying operational text as security goals. In contrast, instruction tuning on SecGoal enables compact models with 7B/9B parameters to achieve F1-scores above 80%, substantially outperforming larger general-purpose models. Our work establishes a foundational dataset and reproducible baseline for automated formal protocol analysis.
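To make the precision-recall imbalance concrete, the short sketch below computes the standard F1-score (the harmonic mean of precision and recall). The inputs are illustrative assumptions, not figures reported by the paper: precision is set to the 15% ceiling mentioned in the abstract and recall to a perfect 1.0, which gives an upper bound on the F1-score any model in that regime could reach.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only (assumed, not taken from the paper's tables):
# precision at the 15% ceiling from the abstract, recall at its maximum of 1.0.
f1_ceiling = f1_score(0.15, 1.0)
print(f"{f1_ceiling:.3f}")  # prints 0.261
```

Because F1 increases with both precision and recall, any model with precision below 15% stays under roughly 0.26 F1 regardless of how high its recall is, which is why an F1-score above 80% requires precision and recall to be high at the same time.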

Data Curation & Synthetic Data | Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
