TabKD is a data-free knowledge distillation method for tabular data that addresses the limitations of existing methods by explicitly modeling feature interactions. It learns adaptive feature bins aligned with teacher decision boundaries and generates synthetic queries to maximize pairwise interaction coverage. Experiments on benchmark datasets show that TabKD achieves higher student-teacher agreement than state-of-the-art baselines, and that interaction coverage strongly correlates with distillation quality.
TabKD achieves state-of-the-art data-free knowledge distillation for tabular data by generating synthetic data that maximizes interaction diversity, a critical factor previously overlooked.
Data-free knowledge distillation enables model compression without original training data, which is critical for privacy-sensitive tabular domains. However, existing methods perform poorly on tabular data because they do not explicitly address feature interactions, the fundamental way tabular models encode predictive knowledge. We identify interaction diversity, the systematic coverage of feature combinations, as an essential requirement for effective tabular distillation. To operationalize this insight, we propose TabKD, which learns adaptive feature bins aligned with teacher decision boundaries, then generates synthetic queries that maximize pairwise interaction coverage. Across 4 benchmark datasets and 4 teacher architectures, TabKD achieves the highest student-teacher agreement in 14 of 16 configurations, outperforming 5 state-of-the-art baselines. We further show that interaction coverage strongly correlates with distillation quality, validating our core hypothesis. Our work establishes interaction-focused exploration as a principled framework for tabular model extraction.
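To make the two quantities in the abstract concrete, the following is a minimal sketch of how pairwise interaction coverage and student-teacher agreement could be measured. It is not TabKD's implementation: the function names `pairwise_coverage` and `agreement` are hypothetical, the bin cut points are supplied by hand rather than learned from a teacher, and coverage is assumed to mean the fraction of (feature pair, bin pair) cells hit by at least one synthetic query.

```python
from itertools import combinations

import numpy as np


def pairwise_coverage(X, bin_edges):
    """Fraction of (feature pair, bin pair) cells hit by at least one row of X.

    X         : (n_samples, n_features) array of synthetic queries.
    bin_edges : one sorted 1-D array of interior cut points per feature
                (assumed given here; TabKD learns its bins adaptively).
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    # Map every value to a bin index; k cut points produce k + 1 bins.
    binned = np.column_stack(
        [np.digitize(X[:, j], bin_edges[j]) for j in range(n_features)]
    )
    n_bins = [len(edges) + 1 for edges in bin_edges]

    covered = 0
    total = 0
    for i, j in combinations(range(n_features), 2):
        # Distinct bin combinations observed for this feature pair.
        cells = {(int(a), int(b)) for a, b in zip(binned[:, i], binned[:, j])}
        covered += len(cells)
        total += n_bins[i] * n_bins[j]
    return covered / total


def agreement(teacher_labels, student_labels):
    """Share of inputs on which student and teacher predict the same label."""
    teacher_labels = np.asarray(teacher_labels)
    student_labels = np.asarray(student_labels)
    return float(np.mean(teacher_labels == student_labels))


if __name__ == "__main__":
    edges = [np.array([0.5]), np.array([0.5])]  # two bins per feature
    diagonal = [[0.1, 0.1], [0.9, 0.9]]         # hits 2 of the 4 cells
    full = diagonal + [[0.1, 0.9], [0.9, 0.1]]  # hits all 4 cells
    print(pairwise_coverage(diagonal, edges))
    print(pairwise_coverage(full, edges))
    print(agreement([0, 1, 1, 0], [0, 1, 0, 0]))
```

Under these assumptions, a query set that only samples along the diagonal leaves half of the feature combinations unexplored even though each feature's bins are individually covered, which is the kind of gap that interaction-focused query generation is meant to close.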