AutoVeriFix+ is a three-stage framework that leverages LLMs and concolic testing to generate functionally correct Verilog RTL code. It uses LLMs to generate Python reference models and initial Verilog candidates, then applies concolic testing to identify and fix state-transition errors using cycle-accurate execution traces. Finally, AutoVeriFix+ optimizes the generated code by pruning functionally redundant branches identified from coverage reports.
LLMs can now generate Verilog RTL code with over 80% functional correctness and 25% less redundant logic, thanks to a novel framework that combines semantic reasoning with state-space exploration.
Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as Python and C++. However, their application to hardware description languages such as Verilog is challenging due to the scarcity of high-quality training data. Current approaches to Verilog code generation using LLMs often focus on syntactic correctness, resulting in code with functional errors. To address these challenges, we propose AutoVeriFix+, a novel three-stage framework that integrates high-level semantic reasoning with state-space exploration to enhance functional correctness and design efficiency. In the first stage, an LLM generates high-level Python reference models that define the intended circuit behavior. In the second stage, another LLM generates initial Verilog RTL candidates and iteratively fixes syntactic errors. In the third stage, we introduce a concolic testing engine to exercise deep sequential logic and identify corner-case vulnerabilities. With cycle-accurate execution traces and internal register snapshots, AutoVeriFix+ provides the LLM with the causal context necessary to resolve complex state-transition errors. Furthermore, AutoVeriFix+ generates a coverage report to identify functionally redundant branches, enabling the LLM to perform semantic pruning for area optimization. Experimental results demonstrate that AutoVeriFix+ achieves over 80% functional correctness on rigorous benchmarks, reaching a pass@10 score of 90.2% on the VerilogEval-machine dataset. In addition, it eliminates an average of 25% redundant logic across benchmarks through trace-aware optimization.
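To make the repair mechanism concrete, the kernel of the second and third stages described above can be sketched as comparing a cycle-accurate trace of the simulated Verilog candidate against the Python reference model and reporting the first divergence, with register snapshots, as repair context for the LLM. The following is a minimal illustrative sketch, not the authors' implementation; the toy counter models and all function names are hypothetical stand-ins.

```python
def reference_counter(n_cycles, width=4):
    """Golden Python reference model (Stage 1): a wrapping up-counter."""
    state = 0
    trace = []
    for cycle in range(n_cycles):
        trace.append({"cycle": cycle, "count": state})
        state = (state + 1) % (1 << width)
    return trace

def buggy_dut_trace(n_cycles, width=4):
    """Hypothetical stand-in for a simulated Verilog candidate with a
    state-transition bug: the counter never wraps (e.g. a mis-sized
    register), which only shows up deep in the sequential behavior."""
    state = 0
    trace = []
    for cycle in range(n_cycles):
        trace.append({"cycle": cycle, "count": state})
        state = state + 1  # bug: missing wrap-around
    return trace

def first_divergence(ref_trace, dut_trace):
    """Return the first cycle where the candidate's registers diverge
    from the reference, plus both snapshots -- the kind of causal
    context the abstract says is handed to the LLM for repair."""
    for ref, dut in zip(ref_trace, dut_trace):
        if ref != dut:
            return {"cycle": ref["cycle"], "expected": ref, "actual": dut}
    return None  # traces agree: the candidate passes this stimulus

mismatch = first_divergence(reference_counter(20), buggy_dut_trace(20))
```

The bug is invisible for the first 16 cycles and only surfaces at the wrap point, which is why the abstract emphasizes exercising deep sequential logic rather than checking a few initial cycles.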