SpotIt+ is an open-source tool that evaluates Text-to-SQL systems using bounded equivalence verification: it actively searches for database instances on which the generated and ground-truth SQL queries return different results. A constraint-mining pipeline combines rule-based specification mining over example databases with LLM-based validation so that the generated counterexamples are realistic. Experiments on the BIRD dataset show that the mined constraints yield more realistic differentiating databases, while SpotIt+ still uncovers numerous discrepancies that standard test-based evaluation misses.
Text-to-SQL evaluation gets a reality check: SpotIt+ uses LLM-validated database constraints to find discrepancies missed by standard testing.
We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the ground truth, SpotIt+ actively searches for database instances that differentiate the two queries. To ensure that the generated counterexamples reflect practically relevant discrepancies, we introduce a constraint-mining pipeline that combines rule-based specification mining over example databases with LLM-based validation. Experimental results on the BIRD dataset show that the mined constraints enable SpotIt+ to generate more realistic differentiating databases, while preserving its ability to efficiently uncover numerous discrepancies between generated and gold SQL queries that are missed by standard test-based evaluation.
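To make the idea of bounded equivalence checking under mined constraints concrete, here is a minimal sketch. It is not SpotIt+'s implementation: it assumes a brute-force enumeration over a tiny value domain using Python's built-in sqlite3 module, and the table `emp`, the helper names, and the non-negativity rule are all hypothetical choices for illustration. The LLM-based validation step is omitted.

```python
import itertools
import sqlite3
from collections import Counter

# Hypothetical query pair: the generated query differs from the gold one.
GOLD = "SELECT id FROM emp WHERE age > 18"
GENERATED = "SELECT id FROM emp WHERE age <> 18"


def run(rows, query):
    """Execute `query` on a fresh in-memory table holding `rows`; return a bag of results."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (id INTEGER, age INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    result = Counter(conn.execute(query).fetchall())
    conn.close()
    return result


def mine_nonnegative(example_ages):
    """Illustrative rule-based miner: propose 'age >= 0' if every example satisfies it."""
    if all(a >= 0 for a in example_ages):
        return lambda rows: all(age >= 0 for _, age in rows)
    return lambda rows: True  # no constraint mined


def find_counterexample(q1, q2, max_rows=2, domain=(-1, 5, 40),
                        constraint=lambda rows: True):
    """Search all databases with at most `max_rows` rows (the bound) for one
    that satisfies `constraint` and on which q1 and q2 disagree."""
    candidate_rows = [(i, a) for i in (1, 2) for a in domain]
    for n in range(max_rows + 1):
        for rows in itertools.combinations(candidate_rows, n):
            if not constraint(rows):
                continue  # skip databases that violate the mined constraint
            if run(rows, q1) != run(rows, q2):
                return list(rows)
    return None  # equivalent within the bound


# Without constraints the first differentiating database uses a negative age.
unconstrained = find_counterexample(GOLD, GENERATED)

# With the mined constraint the counterexample uses a plausible age instead.
constraint = mine_nonnegative([22, 35, 61])
constrained = find_counterexample(GOLD, GENERATED, constraint=constraint)

print(unconstrained)  # [(1, -1)]
print(constrained)    # [(1, 5)]
```

In this toy setting the discrepancy is still found once the constraint is applied, but the witness database no longer relies on an implausible value, which is the effect the abstract attributes to mined constraints.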