Feb 23, 2026 · arXiv:2602.19743

NILE: Formalizing Natural-Language Descriptions of Formal Languages

Tristan Kneisel, Marko Schmellenkamp, Fabian Vehlken, Thomas Zeume

AI Summary

The paper introduces NILE, a representation language for formal languages designed to mirror the syntactic structure of natural language descriptions, enabling comparison and explanation of semantic differences between the two. NILE is expressive enough to cover regular languages and fragments of context-free languages common in educational settings. Experiments demonstrate that LLMs can translate natural language descriptions into syntactically similar NILE expressions with high accuracy, facilitating algorithmic explanations for inaccuracies in natural language descriptions.

Key Contribution

LLMs can translate natural language descriptions of formal languages into a new representation, NILE, with high accuracy, enabling automated error explanation in educational settings.

Abstract

This paper explores how natural-language descriptions of formal languages can be compared to their formal representations and how semantic differences can be explained. This is motivated by educational scenarios where learners describe a formal language (presented, e.g., by a finite state automaton, regular expression, pushdown automaton, context-free grammar, or in set notation) in natural language, and an educational support system has to (1) judge whether the natural-language description accurately describes the formal language and (2) explain why a description is not accurate. To address this question, we introduce a representation language for formal languages, Nile, which is designed so that Nile expressions can mirror the syntactic structure of natural-language descriptions of formal languages. Nile is sufficiently expressive to cover a broad variety of formal languages, including all regular languages and fragments of context-free languages typically used in educational contexts. Generating Nile expressions that are syntactically close to natural-language descriptions then makes it possible to explain inaccuracies in the descriptions algorithmically. In experiments on an educational data set, we show that LLMs can translate natural-language descriptions into equivalent, syntactically close Nile expressions with high accuracy, which allows explanations for incorrect natural-language descriptions to be provided algorithmically. Our experiments also show that while natural-language descriptions can also be translated into regular expressions (but not context-free grammars), the resulting expressions are often not syntactically close and thus not suitable for providing explanations.
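To make task (1) and task (2) concrete, the following is a minimal sketch of comparing a target language with a learner's description by enumerating short words and reporting the first word on which they disagree. It is not the paper's method and does not use Nile: the function names, the example languages, and the length bound are assumptions chosen for illustration, and a bounded search can only find differences, never prove equivalence.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def first_difference(target, candidate, alphabet="ab", max_len=6):
    """Return (word, in_target, in_candidate) for the first word found on
    which the two membership predicates disagree, or None if they agree on
    all words up to length max_len (hypothetical helper, bounded check only)."""
    for word in words_up_to(alphabet, max_len):
        in_target, in_candidate = target(word), candidate(word)
        if in_target != in_candidate:
            return (word, in_target, in_candidate)
    return None


# Assumed target language: words over {a, b} with an even number of a's.
def target(word):
    return word.count("a") % 2 == 0


# Assumed learner description: "words with no a or with exactly two a's".
def candidate(word):
    return word.count("a") in (0, 2)


difference = first_difference(target, candidate)
if difference is not None:
    word, in_target, in_candidate = difference
    print(f"{word!r}: in target = {in_target}, in description = {in_candidate}")
```

A counterexample word of this kind tells a learner that the description is wrong, but not which part of it is wrong; the abstract's argument is that a representation syntactically close to the description is what makes the latter kind of explanation possible.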

Code Generation & Program Synthesis · Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 23

Year: 2026

Venue: N/A
