The paper introduces ToolCAD, a framework that uses LLMs as agents to interact with CAD engines for text-to-CAD generation. It creates an interactive CAD modeling gym for training these agents using hybrid feedback and human supervision, and employs an end-to-end post-training strategy based on online curriculum reinforcement learning to refine CAD Modeling Chain of Thought (CAD-CoT). Results show that ToolCAD enables open-source LLMs to perform comparably to proprietary models in CAD tool usage.
Open-source LLMs can perform comparably to proprietary models in text-to-CAD generation when they are trained as CAD tool-using agents with online curriculum reinforcement learning.
Computer-Aided Design (CAD) is an expert-level task that relies on long-horizon reasoning and coherent modeling actions. Large Language Models (LLMs) have shown remarkable advancements in enabling language agents to tackle real-world tasks. However, no prior work has investigated how tool-using LLMs can best interact with CAD engines, which hinders the emergence of LLM-based agentic text-to-CAD modeling systems. We propose ToolCAD, a novel agentic CAD framework that deploys LLMs as tool-using agents for text-to-CAD generation. Furthermore, we introduce an interactive CAD modeling gym to roll out reasoning and tool-augmented interaction trajectories with the CAD engine, incorporating hybrid feedback and human supervision. We also present an end-to-end post-training strategy that enables the LLM agent to elicit refined CAD Modeling Chain of Thought (CAD-CoT) and evolve into a proficient CAD tool-using agent via online curriculum reinforcement learning. Our findings demonstrate that ToolCAD fills the gap in adopting and training open-source LLMs as CAD tool-using agents, enabling them to perform comparably to proprietary models and paving the way for more accessible and robust autonomous text-to-CAD modeling systems.
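To make the idea of an interactive CAD modeling gym more concrete, the following is a minimal Python sketch of a gym-style loop in which an agent issues tool calls to a CAD engine and receives feedback. It is an illustration under stated assumptions, not ToolCAD's implementation: the class names (`ToyCadEngine`, `ToyCadGym`), the tool names (`sketch_rectangle`, `extrude`, `finish`), and the reward values are all hypothetical. Here "hybrid feedback" is approximated as a small per-step signal for whether a tool call executed, plus a final outcome signal for whether the model matches a target volume. Human supervision and the online curriculum are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyCadEngine:
    """Hypothetical stand-in for a CAD engine (not a real CAD API)."""
    sketch: Optional[Tuple[float, float]] = None  # active (width, height)
    volume: float = 0.0                           # accumulated solid volume

    def call(self, tool: str, **args) -> str:
        if tool == "sketch_rectangle":
            w, h = args["width"], args["height"]
            if w <= 0 or h <= 0:
                raise ValueError("rectangle dimensions must be positive")
            self.sketch = (w, h)
            return f"sketch {w}x{h} created"
        if tool == "extrude":
            if self.sketch is None:
                raise ValueError("extrude requires an active sketch")
            depth = args["depth"]
            if depth <= 0:
                raise ValueError("extrusion depth must be positive")
            self.volume += self.sketch[0] * self.sketch[1] * depth
            self.sketch = None  # the sketch is consumed by the extrusion
            return f"solid volume is now {self.volume}"
        raise ValueError(f"unknown tool: {tool}")


class ToyCadGym:
    """Gym-style wrapper: step() returns (observation, reward, done)."""

    def __init__(self, target_volume: float, max_steps: int = 8):
        self.target_volume = target_volume
        self.max_steps = max_steps
        self.reset()

    def reset(self) -> str:
        self.engine = ToyCadEngine()
        self.steps = 0
        return "empty model"

    def step(self, tool: str, **args):
        self.steps += 1
        if tool == "finish":
            # Outcome feedback (assumed): does the model match the target?
            matched = abs(self.engine.volume - self.target_volume) < 1e-9
            return "finished", (1.0 if matched else 0.0), True
        try:
            obs = self.engine.call(tool, **args)
            reward = 0.1   # step feedback (assumed): the call executed
        except (KeyError, ValueError) as err:
            obs = f"error: {err}"
            reward = -0.1  # step feedback (assumed): the call failed
        return obs, reward, self.steps >= self.max_steps


def run_episode(env: ToyCadGym, actions) -> float:
    """Replay a fixed list of (tool, args) pairs and sum the rewards."""
    env.reset()
    total = 0.0
    for tool, args in actions:
        _, reward, done = env.step(tool, **args)
        total += reward
        if done:
            break
    return total


# A scripted trajectory standing in for an LLM agent's tool calls.
plan = [
    ("extrude", {"depth": 2.0}),  # fails: no sketch exists yet
    ("sketch_rectangle", {"width": 3.0, "height": 4.0}),
    ("extrude", {"depth": 2.0}),
    ("finish", {}),
]
print(run_episode(ToyCadGym(target_volume=24.0), plan))
```

In a real setup, the scripted `plan` would be replaced by an LLM policy that reads each observation and chooses the next tool call, and the collected trajectories and rewards would feed a reinforcement learning update. Those parts are deliberately left out of this sketch.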