The paper addresses the limitations of static workflows in Text-to-SQL by proposing SquRL, a reinforcement learning framework that enables adaptive workflow construction at inference time. The authors show, theoretically and empirically, that dynamic policies outperform static workflows, particularly on complex and out-of-distribution queries, and that the gains are driven by heterogeneity across candidate workflows. SquRL uses a rule-based reward function and incorporates dynamic actor masking and pseudo rewards to improve exploration and training efficiency.
Forget hand-tuning pipelines: a new RL framework lets LLMs dynamically assemble Text-to-SQL workflows at inference time, outperforming even the best static configurations.
Text-to-SQL has recently achieved impressive progress, yet remains difficult to apply effectively in real-world scenarios. This gap stems from the reliance on single static workflows, which fundamentally limits scalability to out-of-distribution and long-tail scenarios. Instead of requiring users to select suitable methods through extensive experimentation, we attempt to enable systems to adaptively construct workflows at inference time. Through theoretical and empirical analysis, we demonstrate that optimal dynamic policies consistently outperform the best static workflow, with performance gains fundamentally driven by heterogeneity across candidate workflows. Motivated by this, we propose SquRL, a reinforcement learning framework that enhances LLMs' reasoning capability in adaptive workflow construction. We design a rule-based reward function and introduce two effective training mechanisms: dynamic actor masking to encourage broader exploration, and pseudo rewards to improve training efficiency. Experiments on widely used Text-to-SQL benchmarks demonstrate that dynamic workflow construction consistently outperforms the best static workflow methods, with especially pronounced gains on complex and out-of-distribution queries. The code is available at https://github.com/Satissss/SquRL
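The heterogeneity argument can be illustrated with a toy calculation. The sketch below is not taken from the SquRL code base: the workflow names and the correctness matrix are hypothetical, and the "dynamic policy" is an idealized oracle that always picks a workflow that solves the query whenever one exists. It only shows why a per-query choice can never do worse than the best single workflow, and why the gap vanishes when all workflows behave identically.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: 1 means the workflow produced correct SQL for that
# query, 0 means it did not. Names and outcomes are invented for illustration.
OUTCOMES: Dict[str, List[int]] = {
    "workflow_a": [1, 1, 0, 0, 1, 0],
    "workflow_b": [0, 1, 1, 0, 0, 1],
    "workflow_c": [1, 0, 0, 1, 0, 0],
}


def accuracy(results: List[int]) -> float:
    """Fraction of queries answered correctly."""
    return sum(results) / len(results)


def best_static(outcomes: Dict[str, List[int]]) -> Tuple[str, float]:
    """Pick the single workflow with the highest accuracy over all queries."""
    name = max(outcomes, key=lambda w: accuracy(outcomes[w]))
    return name, accuracy(outcomes[name])


def oracle_dynamic_accuracy(outcomes: Dict[str, List[int]]) -> float:
    """Accuracy of an idealized policy that chooses the best workflow per query.

    A query counts as solved if at least one candidate workflow solves it.
    """
    per_query = zip(*outcomes.values())
    solved = [max(query_results) for query_results in per_query]
    return sum(solved) / len(solved)


if __name__ == "__main__":
    name, static_acc = best_static(OUTCOMES)
    dynamic_acc = oracle_dynamic_accuracy(OUTCOMES)
    print(f"best static workflow: {name} ({static_acc:.2f})")
    print(f"oracle dynamic policy: {dynamic_acc:.2f}")

    # With homogeneous workflows (identical outcomes) the gap disappears.
    homogeneous = {"w1": [1, 0, 1, 0], "w2": [1, 0, 1, 0]}
    gap = oracle_dynamic_accuracy(homogeneous) - best_static(homogeneous)[1]
    print(f"gap with homogeneous workflows: {gap:.2f}")
```

In this toy setting each workflow solves a different subset of queries, so the oracle reaches 1.00 while the best static workflow reaches 0.50. A learned policy sits somewhere between the two bounds, which is the gap a framework such as SquRL aims to close.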