Cambridge | Apr 21, 2026 | arXiv:2604.18966

Self-Improving Tabular Language Models via Iterative Group Alignment

Yunbo Long, Tejumade Afonja, A. Brintrup, Mario Fritz

AI Summary

The paper introduces TabGRAA, a novel self-improving framework for tabular data generation using language models. TabGRAA iteratively refines the language model by partitioning generated samples into high- and low-quality groups based on an automated quality signal (e.g., a distinguishability classifier) and then optimizing a group-relative advantage objective. This approach enables the model to learn from its own generated samples, improving fidelity, utility, and privacy without exposing additional real data.

Key Contribution

TabGRAA recasts tabular data synthesis from static statistical replication into a dynamic, self-improving generation process driven by automated feedback.

Abstract

While language models have been adapted for tabular data generation, two fundamental limitations remain: (1) static fine-tuning produces models that cannot learn from their own generated samples and adapt to self-correct, and (2) autoregressive objectives preserve local token coherence but neglect global statistical properties, degrading tabular quality. Reinforcement learning offers a potential solution but requires designing reward functions that balance competing objectives, which is impractical for tabular data. To fill the gap, we introduce TabGRAA (Tabular Group-Relative Advantage Alignment), the first self-improving framework for tabular data generation via automated feedback. At each iteration, TabGRAA uses an automated quality signal (such as a two-sample distinguishability classifier or a distance-based reward) to partition newly generated samples into high- and low-quality groups, then optimizes a group-relative advantage objective that reinforces realistic patterns while penalizing artifacts. The specific signal is a modular choice rather than a fixed component of the framework. This establishes a virtuous feedback cycle, where the quality signal is re-computed against newly generated synthetic samples at each round; the language model is only fine-tuned on these self-generated signals, so no additional real records are exposed during alignment, mitigating data-leakage risk beyond the initial supervised fine-tuning. Experiments show TabGRAA outperforms existing methods in fidelity, utility, and privacy, while matching or exceeding diffusion-based synthesizers, advancing tabular synthesis from static statistical replication to dynamic, self-improving generation.
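The partition-and-align step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact objective, so everything below is an assumption made for illustration: the nearest-neighbor `distance_score`, the `reference` rows it compares against, the `top_fraction` split, the standardized-score form of the advantage, and the `surrogate_loss` are hypothetical stand-ins, not the authors' implementation.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Row = Sequence[float]


def distance_score(row: Row, reference: Sequence[Row]) -> float:
    """Hypothetical distance-based quality signal (an assumption):
    negative Euclidean distance to the nearest reference row, so that
    higher scores mean the row looks more like the reference set."""
    return -min(math.dist(row, ref) for ref in reference)


def partition(scores: Sequence[float],
              top_fraction: float = 0.5) -> Tuple[List[int], List[int]]:
    """Split sample indices into high- and low-quality groups by score rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Keep at least one sample in the high group, and one in the low group
    # whenever there are two or more samples.
    k = max(1, min(len(scores) - 1, round(top_fraction * len(scores))))
    return order[:k], order[k:]


def group_relative_advantages(scores: Sequence[float]) -> List[float]:
    """Assumed advantage form: each score standardized against the batch."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0] * len(scores)
    return [(s - mu) / sigma for s in scores]


def surrogate_loss(log_probs: Sequence[float],
                   advantages: Sequence[float]) -> float:
    """Assumed policy-gradient-style surrogate: minimizing it raises the
    log-probability of positive-advantage samples and lowers the rest."""
    return -mean(a * lp for a, lp in zip(advantages, log_probs))


# Toy usage with made-up numeric rows (illustrative values only).
reference = [(0.0, 0.0), (1.0, 1.0)]
generated = [(0.0, 0.1), (5.0, 5.0), (1.0, 1.0), (3.0, 3.0)]
scores = [distance_score(row, reference) for row in generated]
high, low = partition(scores)
advantages = group_relative_advantages(scores)
```

In this toy run the rows closest to the reference set land in `high` and receive positive advantages, while the distant rows land in `low` and receive negative ones. A real pipeline would compute `log_probs` from the language model for each serialized row and could swap `distance_score` for a distinguishability classifier, which the abstract describes as a modular choice.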

Data Curation & Synthetic Data, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 32

Year: 2026

Venue: N/A
