PKU · Apr 8, 2026 · arXiv:2604.06814

OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neural Networks, and Foundation Models for Tabular Data at Scale

Dihong Jiang, Ruoqi Cao, Zhiyuan Dang, Li Huang, Qingsong Zhang, Zhiyu Wang, Shihao Piao, Shenggao Zhu, Jianlong Chang, Zhouchen Lin, Qi Tian

AI Summary

OmniTabBench, a new benchmark of 3030 tabular datasets categorized by industry, was introduced to evaluate GBDTs, NNs, and foundation models at scale. Large-scale empirical evaluation on OmniTabBench reveals that no single model family consistently outperforms others across all tabular tasks. Decoupled metafeature analysis identifies dataset properties like size and feature skewness that correlate with the relative performance of different model categories, offering more granular guidance than aggregated metrics.

Key Contribution

A 3030-dataset benchmark refutes the idea of a universally superior model for tabular data, showing instead that relative performance depends on dataset characteristics.

Abstract

While traditional tree-based ensemble methods have long dominated tabular tasks, deep neural networks and emerging foundation models have challenged this primacy, yet no consensus exists on a universally superior paradigm. Existing benchmarks typically contain fewer than 100 datasets, raising concerns about evaluation sufficiency and potential selection biases. To address these limitations, we introduce OmniTabBench, the largest tabular benchmark to date, comprising 3030 datasets spanning diverse tasks, collected from a wide range of sources and categorized by industry using large language models. We conduct an unprecedented large-scale empirical evaluation of state-of-the-art models from all model families on OmniTabBench, confirming the absence of a dominant winner. Furthermore, through a decoupled metafeature analysis, which examines individual properties such as dataset size, feature types, and feature and target skewness/kurtosis, we elucidate conditions favoring specific model categories, providing clearer, more actionable guidance than prior compound-metric studies.
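To make the metafeatures named in the abstract concrete, the following is a minimal sketch of how dataset size, skewness, and kurtosis might be computed for one dataset. It is not the authors' code: the function names, the use of biased (population) moments, and the choice to average per-feature values into a single number are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def _central_moment(values: Sequence[float], order: int) -> float:
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values: Sequence[float]) -> float:
    """Biased skewness: third central moment over variance^1.5."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return 0.0  # constant column: treat as zero by convention (assumption)
    return _central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values: Sequence[float]) -> float:
    """Biased excess kurtosis: fourth central moment over variance^2, minus 3."""
    m2 = _central_moment(values, 2)
    if m2 == 0:
        return 0.0  # constant column: treat as zero by convention (assumption)
    return _central_moment(values, 4) / m2 ** 2 - 3.0


def metafeatures(
    columns: Dict[str, List[float]], target: List[float]
) -> Dict[str, float]:
    """Hypothetical per-dataset summary of individual (decoupled) properties.

    Averaging over features is an illustrative choice, not the paper's method.
    """
    skews = [skewness(col) for col in columns.values()]
    kurts = [excess_kurtosis(col) for col in columns.values()]
    return {
        "n_rows": len(target),
        "n_features": len(columns),
        "mean_feature_skewness": sum(skews) / len(skews),
        "mean_feature_kurtosis": sum(kurts) / len(kurts),
        "target_skewness": skewness(target),
        "target_kurtosis": excess_kurtosis(target),
    }


if __name__ == "__main__":
    toy = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [1.0, 1.0, 1.0, 1.0, 10.0]}
    print(metafeatures(toy, target=[0.0, 1.0, 0.0, 1.0, 0.0]))
```

Keeping each property as its own entry, rather than folding them into one composite score, mirrors the "decoupled" framing: each value can then be related to model-family performance separately.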

Data Curation & Synthetic Data · Eval Frameworks & Benchmarks

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
