This paper introduces a new task formulation for CAD generation that requires agents to produce fully assembled multi-part STEP files from engineering briefs, validated via finite element analysis (FEA). The authors find that Codex (GPT-5.5) and Claude Code (Opus-4.7) agents struggle to produce passing artifacts without additional supervision. To address this, they introduce a text-only blueprint schema and a 21-view image renderer as feedback signals, which improve geometric reconstruction on S2O and Fusion360 and move CAD programs toward physically and structurally sound designs.
LLMs can't engineer: even the best models fail to produce structurally sound CAD designs without iterative refinement and FEA-informed feedback.
Computer-aided design (CAD) is the backbone of modern industrial design, yet learned CAD generators still fall short of real engineering pipelines: they neither iterate like engineers nor evaluate what engineering requires. Prior work has treated CAD generation as two disjoint steps, part synthesis and assembly, where the former is graded by proximity to a gold reference and the latter, when handled at all, is reduced to a separate constraint-solving step. In this work, we introduce a more industry-native task formulation that requires a model to produce a fully assembled multi-part STEP file from a free-form engineering brief, which is then validated via finite element analysis (FEA). FEA validation reveals that Codex (GPT-5.5) and Claude Code (Opus-4.7) agents do not produce a single strict-passing artifact in the main first-attempt sweep, with the best configuration meeting only about 20% of typed requirements on average. Moreover, we introduce two additional supervision signals that better align the generation loop with how engineers iterate in practice: a novel text-only blueprint schema and a 21-view image renderer that aids the agent's visual inspection. On S2O and Fusion360, the same feedback tools improve geometric reconstruction, with GPT-5.5/xhigh rising from 0.444 to 0.592 Box-IoU on S2O and from 0.397 to 0.505 on Fusion360. Together these signals move CAD programs toward artifacts that are not only visually plausible but also checked against physical and structural requirements.
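The abstract reports geometric reconstruction as Box-IoU but does not define the metric here. As a rough illustration only, the sketch below assumes Box-IoU means the intersection-over-union of two axis-aligned 3D bounding boxes, each given as a (min corner, max corner) pair. The `Box` type and the `box_volume` and `box_iou` names are hypothetical, and the paper's actual metric may differ (for example, per-part matching or oriented boxes).

```python
from typing import Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # hypothetical layout: (min corner, max corner)


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; zero if any side is degenerate."""
    volume = 1.0
    for lo, hi in zip(box[0], box[1]):
        volume *= max(hi - lo, 0.0)
    return volume


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 3D boxes (assumed metric)."""
    intersection = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0.0:
            return 0.0  # boxes do not overlap along this axis
        intersection *= overlap
    union = box_volume(a) + box_volume(b) - intersection
    return intersection / union if union > 0.0 else 0.0


if __name__ == "__main__":
    predicted = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
    reference = ((0.5, 0.0, 0.0), (1.5, 1.0, 1.0))
    print(f"Box-IoU: {box_iou(predicted, reference):.3f}")
```

Under this assumed definition, a score of 1.0 means the predicted and reference boxes coincide and 0.0 means they do not overlap at all, so the reported gains correspond to predicted geometry occupying more nearly the same extent as the reference.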