The paper introduces SQL-Commenter, a method for generating SQL comments using LLaMA-3.1-8B, addressing the challenge of inadequate comments in complex SQL queries. The approach combines continual pre-training on a large SQL corpus, supervised fine-tuning, and Direct Preference Optimization (DPO) using human feedback to improve the LLM's understanding of SQL semantics. Evaluated on Spider and Bird benchmarks, SQL-Commenter significantly outperforms state-of-the-art baselines, achieving substantial gains in BLEU-4, METEOR, and ROUGE-L scores and demonstrating superior comment quality in human evaluations.
Forget struggling with cryptic SQL: a new LLM fine-tuned on human preferences generates comments so good that they beat Qwen3-14B by up to 13 percentage points on standard metrics.
SQL query comprehension is a significant challenge due to complex syntax, diverse join types, and deep nesting. Many queries lack adequate comments, severely hindering code readability, maintainability, and knowledge transfer. Automated SQL comment generation faces two main challenges: limited datasets that inadequately represent complex real-world queries, and large language models' (LLMs') insufficient understanding of SQL-specific semantics. Our empirical analysis shows that even after continual pre-training and supervised fine-tuning, LLMs struggle with complex SQL semantics, yielding inaccurate comments. To address this, we propose SQL-Commenter, an advanced method based on LLaMA-3.1-8B. We first construct a comprehensive dataset of complex SQL queries with expert-verified comments. Next, we perform continual pre-training on a large SQL corpus to enhance the LLM's syntactic and semantic understanding, followed by supervised fine-tuning. Finally, we introduce Direct Preference Optimization (DPO) using human feedback. SQL-Commenter utilizes a preference-based loss function to favor preferred outputs, enhancing fine-grained semantic learning and context-dependent quality assessment. Evaluated on the Spider and Bird benchmarks, SQL-Commenter significantly outperforms state-of-the-art baselines. On average, it surpasses the strongest baseline (Qwen3-14B) by 9.29, 4.99, and 13.23 percentage points on BLEU-4, METEOR, and ROUGE-L, respectively. Moreover, human evaluation demonstrates the superior quality of comments generated by SQL-Commenter in terms of correctness, completeness, and naturalness.
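The DPO step trains on pairs of candidate comments where annotators marked one as preferred. As a minimal sketch only (this is the standard DPO objective from the original DPO paper, not the paper's implementation; the function name, argument names, and the beta value are illustrative), the preference-based loss for a single pair can be written as:

```python
import math

def dpo_loss(logp_pref, logp_rej, ref_logp_pref, ref_logp_rej, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    logp_pref / logp_rej         -- log-probabilities of the preferred and
                                    rejected comment under the current policy
    ref_logp_pref / ref_logp_rej -- the same log-probabilities under a frozen
                                    reference model (e.g. the SFT checkpoint)
    beta                         -- temperature scaling the implicit KL penalty
    """
    # Margin between how much the policy (vs. the reference) favors
    # the preferred comment over the rejected one.
    margin = beta * ((logp_pref - ref_logp_pref) - (logp_rej - ref_logp_rej))
    # -log(sigmoid(margin)): shrinks as the policy learns to rank the
    # human-preferred comment higher than the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Minimizing this loss pushes the model to assign higher relative likelihood to the human-preferred comment without drifting far from the reference model, which is how DPO injects fine-grained preference signal after supervised fine-tuning.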