The paper introduces SPARC, a neuro-symbolic framework for automated C unit test generation that addresses the limitations of direct intent-to-code synthesis by LLMs. SPARC uses a four-stage process involving CFG analysis, an Operation Map for grounded reasoning, path-targeted test synthesis, and iterative self-correction with compiler feedback. Experiments on 59 subjects demonstrate that SPARC significantly outperforms vanilla LLM prompting and matches or exceeds symbolic execution in coverage and mutation score, while also improving code readability and maintainability.
LLMs can generate surprisingly effective C unit tests when guided by program structure and constraints, achieving coverage comparable to symbolic execution while producing more readable code.
Automated unit test generation for C remains a formidable challenge due to the semantic gap between high-level program intent and the rigid syntactic constraints of pointer arithmetic and manual memory management. While Large Language Models (LLMs) exhibit strong generative capabilities, direct intent-to-code synthesis frequently suffers from the leap-to-code failure mode, in which models emit code prematurely, without grounding it in program structure, constraints, and semantics. The result is non-compilable tests, hallucinated function signatures, low branch coverage, and semantically irrelevant assertions that fail to expose bugs. We introduce SPARC, a neuro-symbolic, scenario-based framework that bridges this gap through four stages: (1) Control Flow Graph (CFG) analysis, (2) an Operation Map that grounds LLM reasoning in validated utility helpers, (3) path-targeted test synthesis, and (4) an iterative, self-correcting validation loop driven by compiler and runtime feedback. We evaluate SPARC on 59 real-world and algorithmic subjects, where it outperforms the vanilla prompt generation baseline by 31.36% in line coverage, 26.01% in branch coverage, and 20.78% in mutation score, and matches or exceeds the symbolic execution tool KLEE on complex subjects. SPARC retains 94.3% of tests through iterative repair and produces code that developers rate significantly higher for readability and maintainability. By aligning LLM reasoning with program structure, SPARC provides a scalable path toward industrial-grade testing of legacy C codebases.
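To make stage (3) concrete, the following is a minimal sketch of what path-targeted tests might look like for a small C function. It is not taken from SPARC or its output: the function `classify_buffer`, the test names, and the `run_path_tests` driver are all hypothetical, chosen only to show one test per control-flow path with assertions tied to that path's expected result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function under test (an assumption for illustration,
 * not from the paper). It has three distinct return paths. */
int classify_buffer(const int *buf, size_t len)
{
    if (buf == NULL || len == 0) {
        return -1;              /* path A: invalid input */
    }
    long sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    if (sum < 0) {
        return 0;               /* path B: negative total */
    }
    return 1;                   /* path C: non-negative total */
}

/* Path A: both sub-conditions of the guard are exercised separately. */
static void test_path_a_invalid_input(void)
{
    int one = 1;
    assert(classify_buffer(NULL, 4) == -1);
    assert(classify_buffer(&one, 0) == -1);
}

/* Path B: the elements sum to -2, so the negative branch is taken. */
static void test_path_b_negative_total(void)
{
    const int values[] = { -5, 2, 1 };
    assert(classify_buffer(values, 3) == 0);
}

/* Path C: the elements sum to 2, so execution falls through. */
static void test_path_c_non_negative_total(void)
{
    const int values[] = { 3, -1, 0 };
    assert(classify_buffer(values, 3) == 1);
}

/* Hypothetical driver: runs one test per path and returns how many
 * paths were covered. Call it from main() in a real test binary. */
int run_path_tests(void)
{
    test_path_a_invalid_input();
    test_path_b_negative_total();
    test_path_c_non_negative_total();
    return 3;
}
```

Each test names the path it targets, which is one plausible way generated tests could stay readable. In a pipeline like the one described, a test that fails to compile or trips an assertion would be sent back for repair using the compiler or runtime message.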