This paper introduces MALMAS, a multi-agent system that leverages LLMs for automated feature generation from tabular data. MALMAS uses a router agent to activate specialized agents and incorporates a memory module with procedural, feedback, and conceptual memory to iteratively refine feature generation. Experiments on public datasets show MALMAS outperforms state-of-the-art baselines in generating high-quality and diverse features.
LLMs can generate better features from tabular data when deployed as a multi-agent system with explicit memory of past procedures, feedback, and concepts.
Automated feature generation extracts informative features from raw tabular data without manual intervention and is crucial for accurate, generalizable machine learning. Traditional methods rely on predefined operator libraries and cannot leverage task semantics, limiting their ability to produce diverse, high-value features for complex tasks. Recent Large Language Model (LLM)-based approaches introduce richer semantic signals, but still suffer from a restricted feature space due to fixed generation patterns and from the absence of feedback from the learning objective. To address these challenges, we propose a Memory-Augmented LLM-based Multi-Agent System (MALMAS) for automated feature generation. MALMAS decomposes the generation process into agents with distinct responsibilities, and a Router Agent activates an appropriate subset of agents per iteration, broadening exploration of the feature space. We also integrate a memory module comprising procedural memory, feedback memory, and conceptual memory, enabling iterative refinement that adaptively guides subsequent feature generation and improves feature quality and diversity. Extensive experiments on multiple public datasets against state-of-the-art baselines demonstrate the effectiveness of our approach. The code is available at https://github.com/fxdong24/MALMAS.
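The control flow the abstract describes (a router activating a subset of specialized agents per iteration, with procedural, feedback, and conceptual memory steering later rounds) can be sketched as a minimal loop. Everything below is an illustrative assumption: the agent names, the routing heuristic, and the `evaluate` stub are placeholders, not the authors' implementation (see the linked repository for that).

```python
import random

class Memory:
    """Toy stand-in for the three memory types named in the abstract."""
    def __init__(self):
        self.procedural = []    # which agents ran, in order (procedural memory)
        self.feedback = []      # (feature, score) pairs (feedback memory)
        self.conceptual = set() # concept families already covered (conceptual memory)

    def record(self, agent_name, feature, score):
        self.procedural.append(agent_name)
        self.feedback.append((feature, score))
        self.conceptual.add(feature.split("_")[0])

    def best(self, k=3):
        return sorted(self.feedback, key=lambda fs: -fs[1])[:k]

# Hypothetical specialized agents; a real system would prompt an LLM here.
def arithmetic_agent(mem):
    return "ratio_colA_colB"

def aggregation_agent(mem):
    return "groupby_mean_colC"

def interaction_agent(mem):
    return "cross_colA_colC"

AGENTS = {"arith": arithmetic_agent, "agg": aggregation_agent, "inter": interaction_agent}

def router(mem, rng):
    """Activate agents not yet tried; otherwise sample a random subset."""
    uncovered = [name for name in AGENTS if name not in mem.procedural]
    return uncovered or rng.sample(list(AGENTS), k=2)

def evaluate(feature, rng):
    """Placeholder for downstream feedback (e.g. validation-metric gain)."""
    return rng.random()

def run(iterations=3, seed=0):
    rng = random.Random(seed)
    mem = Memory()
    for _ in range(iterations):
        for name in router(mem, rng):          # router picks this round's agents
            feat = AGENTS[name](mem)           # agent proposes a feature
            mem.record(name, feat, evaluate(feat, rng))  # memory stores outcome
    return mem
```

The point of the sketch is the feedback loop: each iteration's routing decision reads the memory that earlier iterations wrote, which is what lets generation adapt rather than repeat a fixed pattern.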