The paper introduces TEmBed, a benchmark that evaluates tabular embeddings at four representation levels (cell, row, column, and table), addressing the lack of standardized evaluation for tabular foundation models. Benchmarking a diverse set of existing tabular representation learning models with TEmBed, the authors find that the best model depends on both the task and the representation level. The work offers practical guidance for selecting tabular embeddings and motivates the development of more universal tabular representation models.
Turns out, the best way to represent tabular data depends heavily on the task at hand, so a one-size-fits-all tabular foundation model may be a mirage.
Tabular foundation models aim to learn universal representations of tabular data that transfer across tasks and domains, enabling applications such as table retrieval, semantic search, and table-based prediction. Despite the growing number of such models, it remains unclear which approach works best in practice, as existing methods are often evaluated under task-specific settings that make direct comparison difficult. To address this, we introduce TEmBed, the Tabular Embedding Test Bed, a comprehensive benchmark for systematically evaluating tabular embeddings across four representation levels: cell, row, column, and table. Evaluating a diverse set of tabular representation learning models, we show that the best-performing model depends on both the task and the representation level. Our results offer practical guidance for selecting tabular embeddings in real-world applications and lay the groundwork for developing more general-purpose tabular representation models.
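To make the four representation levels concrete, here is a minimal sketch of how one might derive cell-, row-, column-, and table-level embeddings from a single table. The serialization scheme and the `embed` function are illustrative assumptions, not TEmBed's actual protocol; `embed` is a toy hash-based stand-in for a real text-embedding model.

```python
import hashlib

def embed(text: str) -> list[float]:
    # Toy stand-in for a real text-embedding model (assumption,
    # not the embedding models evaluated in the paper).
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:8]]

header = ["city", "population"]
rows = [["Paris", "2.1M"], ["Tokyo", "13.9M"]]

# Cell level: one embedding per cell value.
cell_embs = [[embed(cell) for cell in row] for row in rows]

# Row level: serialize each row as "column: value" pairs, then embed.
row_embs = [
    embed("; ".join(f"{h}: {v}" for h, v in zip(header, row)))
    for row in rows
]

# Column level: embed the header together with that column's values.
col_embs = [
    embed(h + ": " + ", ".join(r[i] for r in rows))
    for i, h in enumerate(header)
]

# Table level: one embedding for the whole serialized table.
table_emb = embed(
    " | ".join(header) + " || " + " | ".join(",".join(r) for r in rows)
)
```

A benchmark like TEmBed can then pair each level's embeddings with level-appropriate downstream tasks (e.g. table retrieval for table-level vectors), which is why no single serialization or model dominates across all four.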