The paper introduces SecGoal, an expert-annotated benchmark for extracting and formalizing security goals from protocol documents, covering 15 widely deployed protocols. It also presents AIFG, an AI-assisted framework that decomposes the task into context-aware goal extraction and retrieval-augmented formalization. Experiments show that frontier large language models (LLMs) reach high recall but precision below 15% when identifying security goals, while instruction tuning on SecGoal enables compact 7B/9B-parameter models to achieve F1-scores above 80%.
LLMs are surprisingly imprecise at identifying security goals in protocol documents, but instruction tuning compact models on a new benchmark, SecGoal, lifts F1-scores above 80%.
Formal verification provides rigorous guarantees for cryptographic security, yet automating the extraction and formalization of security goals from natural language protocol documents remains a major bottleneck, compounded by the scarcity of expert-annotated resources and integrated frameworks bridging unstructured text and symbolic logic. We introduce SecGoal, the first expert-annotated benchmark covering 15 widely deployed protocol documents, including 5G-AKA and TLS 1.3, and AIFG, an AI-assisted framework that decomposes the task into context-aware goal extraction and retrieval-augmented formalization. We conduct a comprehensive evaluation to assess whether contemporary LLMs are ready to automate this pipeline. Our results reveal a pronounced precision-recall imbalance: frontier models, such as Gemini 2.5-Pro, achieve high recall but precision below 15%, frequently misclassifying operational text as security goals. In contrast, instruction tuning on SecGoal enables compact models with 7B/9B parameters to achieve F1-scores above 80%, substantially outperforming larger general-purpose models. Our work establishes a foundational dataset and reproducible baseline for automated formal protocol analysis.
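To make the precision-recall imbalance concrete, the short sketch below computes F1 as the harmonic mean of precision and recall. It is an illustration only, not the paper's evaluation code: the function name `f1_score` is our own, and the recall value of 0.95 is a hypothetical stand-in, since the abstract reports only "high recall". The one figure taken from the abstract is the 15% precision ceiling, which by itself caps F1 at roughly 0.26 no matter how high recall goes.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Upper bound on F1 at precision 0.15: recall set to its maximum of 1.0.
f1_upper_bound = f1_score(0.15, 1.0)

# Hypothetical recall of 0.95, chosen only for illustration
# (the abstract does not report an exact recall figure).
f1_illustrative = f1_score(0.15, 0.95)

print(f"{f1_upper_bound:.3f}")   # 0.261
print(f"{f1_illustrative:.3f}")  # 0.259
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model that flags nearly every sentence as a security goal cannot score well on F1, which is consistent with the reported gap between frontier models and the instruction-tuned models that exceed 80%.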