Ant Group | ZJU | Feb 26, 2026 | arXiv:2602.22777

KMLP: A Scalable Hybrid Architecture for Web-Scale Tabular Data Modeling

Mingming Zhang, Pengfei Shi, Zhiqing Xiao, Feng Zhao, Guandong Sun, Yulin Kang, Ruizhe Gao, Ningtao Wang, Xing Fu, Weiqiang Wang

AI Summary

The paper introduces KMLP, a hybrid deep learning architecture combining a Kolmogorov-Arnold Network (KAN) front-end for feature transformation with a Gated Multilayer Perceptron (gMLP) backbone for interaction modeling, designed to address scalability challenges in web-scale tabular data modeling. KMLP leverages KANs to automatically learn complex, non-linear feature transformations, mitigating the need for manual feature engineering and handling issues like feature anisotropy and non-stationarity. Experiments on public benchmarks and a large industrial dataset demonstrate that KMLP achieves state-of-the-art performance, particularly excelling at larger scales compared to GBDTs and other baselines.

Key Contribution

Ditch the manual feature engineering: KMLP's hybrid KAN-gMLP architecture automatically learns complex feature transformations and interactions, outperforming GBDTs on web-scale tabular data.

Abstract

Predictive modeling on web-scale tabular data with billions of instances and hundreds of heterogeneous numerical features faces significant scalability challenges. These features exhibit anisotropy, heavy-tailed distributions, and non-stationarity, creating bottlenecks for models like Gradient Boosting Decision Trees and requiring laborious manual feature engineering. We introduce KMLP, a hybrid deep architecture integrating a shallow Kolmogorov-Arnold Network (KAN) front-end with a Gated Multilayer Perceptron (gMLP) backbone. The KAN front-end uses learnable activation functions to automatically model complex non-linear transformations for each feature, while the gMLP backbone captures high-order interactions. Experiments on public benchmarks and an industrial dataset with billions of samples show KMLP achieves state-of-the-art performance, with advantages over baselines like GBDTs increasing at larger scales, validating KMLP as a scalable deep learning paradigm for large-scale web tabular data.
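The two stages described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`hat_basis`, `kan_front_end`, `gating_unit`), the piecewise-linear basis standing in for a spline basis, the clamping of out-of-range values, and the split-and-multiply gate loosely modeled on the gating unit in gMLP are all assumptions chosen for illustration; the paper's actual basis, grid, and layer layout may differ.

```python
from typing import List, Sequence


def hat_basis(x: float, grid: Sequence[float]) -> List[float]:
    """Piecewise-linear "hat" basis values of x on a sorted, uniformly spaced grid.

    Assumption: a hat basis stands in for the spline basis a KAN would normally use.
    """
    step = grid[1] - grid[0]
    # Assumption: clamp out-of-range (e.g. heavy-tailed) values to the grid edges.
    x = min(max(x, grid[0]), grid[-1])
    return [max(0.0, 1.0 - abs(x - g) / step) for g in grid]


def kan_front_end(row: Sequence[float],
                  coeffs: Sequence[Sequence[float]],
                  grid: Sequence[float]) -> List[float]:
    """Apply one learnable univariate function per feature.

    phi_j(x_j) = sum_k coeffs[j][k] * B_k(x_j), where coeffs[j] are the
    trainable parameters for feature j.
    """
    return [
        sum(c * b for c, b in zip(coeffs[j], hat_basis(x, grid)))
        for j, x in enumerate(row)
    ]


def linear(vec: Sequence[float],
           weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Dense layer: weights @ vec + bias."""
    return [sum(w * v for w, v in zip(w_row, vec)) + b
            for w_row, b in zip(weights, bias)]


def gating_unit(z: Sequence[float],
                weights: Sequence[Sequence[float]],
                bias: Sequence[float]) -> List[float]:
    """Split z into halves (u, v) and return u * (weights @ v + bias) elementwise.

    Assumption: a simplified gate in the spirit of gMLP; with zero weights and
    unit bias it passes u through unchanged.
    """
    half = len(z) // 2
    u, v = z[:half], z[half:]
    return [ui * gi for ui, gi in zip(u, linear(v, weights, bias))]
```

In this sketch, setting `coeffs[j]` equal to the grid values makes each per-feature function the identity inside the grid range, which is a convenient starting point before training. The front-end's output would then be projected and passed through one or more gated blocks to model interactions between features.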

Architecture Design (Transformers, SSMs, MoE) | Distributed Systems & Hardware | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 29

Year: 2026

Venue: N/A
