Apr 27, 2026arXiv:2604.25031

Faithful Autoformalization via Roundtrip Verification and Repair

Daneshvar Amrollahi, Jerry Lopez, Clark W. Barrett

AI Summary

The paper introduces a roundtrip verification method to assess the faithfulness of LLM-generated formalizations of natural language, without relying on ground-truth annotations. This method involves formalizing, translating back to natural language, re-formalizing, and then checking logical equivalence using a formal tool. By diagnosing failures in the translation stages and applying targeted repairs, the authors significantly improve formal equivalence rates, achieving 83-85% on a dataset of 150 traffic rules using Claude Opus 4.6 and GPT-5.2.

Key Contribution

LLMs can now formalize natural language with significantly higher fidelity, thanks to a clever roundtrip verification method that self-diagnoses and repairs translation errors.

Abstract

When an LLM formalizes natural language, how do we know the output is faithful? We propose a roundtrip verification approach which does not require ground-truth annotations: formalize a statement, translate the result back to natural language, re-formalize, and use a formal tool to check logical equivalence. When the two formalizations agree, this provides evidence of a faithful formalization. When they disagree, a diagnosis step identifies which translation stage failed, and a targeted repair operator attempts to correct that stage. We evaluate our approach on 150 traffic rules using Claude Opus 4.6 and GPT-5.2. Diagnosis-guided repair raises formal equivalence from 45--61% to 83--85% for both models, outperforming a random-repair baseline. An independent NLI analysis confirms that formal equivalence is correlated with less semantic drift.

Code Generation & Program Synthesis Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Faithful Autoformalization via Roundtrip Verification and Repair

Related Papers