AV-SQL decomposes complex Text-to-SQL tasks into a pipeline of specialized LLM agents, using agent-generated Common Table Expressions (CTEs) called "agentic views" to encapsulate intermediate query logic and filter relevant schema elements. This approach targets settings with large database schemas and multi-step reasoning, where providing the full schema often exceeds the context window and one-shot generation frequently produces non-executable SQL. Experiments on Spider 2.0 show that AV-SQL achieves 70.38% execution accuracy, outperforming state-of-the-art baselines.
LLM agents can conquer complex Text-to-SQL by creating intermediate "agentic views" that break down queries and filter schemas, achieving state-of-the-art results on challenging benchmarks.
Text-to-SQL is the task of translating natural language queries into executable SQL for a given database, enabling non-expert users to access structured data without writing SQL manually. Despite rapid advances driven by large language models (LLMs), existing approaches still struggle with complex queries in real-world settings, where database schemas are large and questions require multi-step reasoning over many interrelated tables. In such cases, providing the full schema often exceeds the context window, while one-shot generation frequently produces non-executable SQL due to syntax errors and incorrect schema linking. To address these challenges, we introduce AV-SQL, a framework that decomposes complex Text-to-SQL into a pipeline of specialized LLM agents. Central to AV-SQL is the concept of agentic views: agent-generated Common Table Expressions (CTEs) that encapsulate intermediate query logic and filter relevant schema elements from large schemas. AV-SQL operates in three stages: (1) a rewriter agent compresses and clarifies the input query; (2) a view generator agent processes schema chunks to produce agentic views; and (3) planner, generator, and revisor agents collaboratively compose these views into the final SQL query. Extensive experiments show that AV-SQL achieves 70.38% execution accuracy on the challenging Spider 2.0 benchmark, outperforming state-of-the-art baselines, while remaining competitive on standard datasets with 85.59% on Spider, 72.16% on BIRD, and 63.78% on KaggleDBQA. Our source code is available at https://github.com/pminhtam/AV-SQL.
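To make the idea of composing views concrete, here is a minimal sketch of how named CTEs could be stitched in front of a final SELECT and executed. It is not the AV-SQL implementation: the helpers `chunk_schema` and `compose_query`, the fixed-size chunking rule, and the toy `customers`/`orders` schema are all assumptions made for illustration, and the view bodies are hand-written stand-ins for what an LLM agent would generate.

```python
import sqlite3
from typing import Dict, List


def chunk_schema(tables: List[str], chunk_size: int) -> List[List[str]]:
    """Split table names into fixed-size chunks (hypothetical chunking rule)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    return [tables[i:i + chunk_size] for i in range(0, len(tables), chunk_size)]


def compose_query(views: Dict[str, str], final_select: str) -> str:
    """Prepend named SELECT statements as CTEs to a final SELECT."""
    if not views:
        return final_select
    # Dict order is preserved, so later views may reference earlier ones.
    ctes = ",\n".join(f"{name} AS (\n{body}\n)" for name, body in views.items())
    return f"WITH {ctes}\n{final_select}"


def demo() -> list:
    """Run a composed query against a toy in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ada', 'UK'), (2, 'Linus', 'FI'), (3, 'Grace', 'US');
        INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0), (4, 3, 10.0);
    """)
    # Hand-written stand-ins for agent-generated views (assumption).
    views = {
        "customer_totals": (
            "SELECT customer_id, SUM(amount) AS total "
            "FROM orders GROUP BY customer_id"
        ),
        "big_spenders": (
            "SELECT customer_id, total FROM customer_totals WHERE total >= 25"
        ),
    }
    final_select = (
        "SELECT c.name, b.total FROM big_spenders AS b "
        "JOIN customers AS c ON c.id = b.customer_id "
        "ORDER BY b.total DESC"
    )
    rows = conn.execute(compose_query(views, final_select)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(chunk_schema(["customers", "orders", "products"], 2))
    print(demo())
```

Each view in this sketch touches only the tables and columns it needs, so the final SELECT can be written against a few small, named intermediate results rather than the full schema. Executing the composed query also surfaces syntax and schema-linking errors early, which is the kind of feedback a revision step could act on.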