SydneyJun 7, 2026arXiv:2606.08465

An Empirical Comparison of General Context-Free Parsers

Huan Vo, Danushka Liyanage, Hong Jin Kang, Sasha Rubin, Rahul Gopinath

AI Summary

This paper conducts the first comprehensive benchmark of six generalized parsing algorithms—CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR—against deterministic LL(1) and LR(1) baselines across 22 diverse grammars. The findings reveal that the performance cost of using general context-free parsers is significantly lower than previously thought, with the GLR family achieving only a 3x median slowdown compared to LR(1) on deterministic grammars. This positions GLR as the optimal choice for enhancing software engineering tools without sacrificing performance.

Key Contribution

General context-free parsers can achieve comparable performance to deterministic parsers, debunking the myth that expressiveness comes at a prohibitive cost.

Abstract

Parsing underpins a vast range of software engineering tasks, from compilers and static analyzers to language servers and fuzz testing tools. Yet most parsers deployed in practice are deterministic (LL or LR), forcing developers not only to contort their grammars to fit the parser, but to simplify the very languages they design sacrificing expressiveness for the sake of parseability. General context-free parsers eliminate this constraint. Yet, despite decades of algorithmic development, no rigorous head-to-head comparison exists across the major families of parsing algorithms. We present the first unified, controlled benchmark of six generalized parsing algorithms: CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR, plus deterministic LL(1) and LR(1) baselines, all implemented in Rust with shared data structures and parse-tree extraction, and evaluated across 22 grammars ranging from simple expressions to full C++ and Java. Our results show that the cost of generality is lower than widely assumed. On deterministic grammars, the GLR family incurs only a 3x median slowdown over LR(1), with a narrow and predictable variance. GLR is the clear performance winner among generalized parsers and a practical default choice for software engineering tools.

Code Generation & Program Synthesis Natural Language Processing

Citation Metrics

Citations0

Influential citations0

References0

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

An Empirical Comparison of General Context-Free Parsers

Related Papers