Apr 21, 2026

arXiv:2604.19072

S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection

Hong Chen, Yingjie Wang, Tieliang Gong, Bin Gu

AI Summary

This paper introduces a Semi-Supervised Meta Additive Model (S$^2$MAM) that improves semi-supervised learning by addressing the sensitivity of Laplacian regularization to noisy variables and to the choice of similarity metric. S$^2$MAM uses bilevel optimization to simultaneously identify informative variables and update the similarity matrix used in manifold regularization. The authors provide theoretical convergence and generalization guarantees and demonstrate empirically that S$^2$MAM achieves robust performance and interpretability across a range of datasets with varying levels of corruption.

Key Contribution

Escaping the curse of noisy data in semi-supervised learning: S$^2$MAM adaptively selects features and tunes similarity metrics, leading to more robust and interpretable models.

Abstract

Semi-supervised learning with manifold regularization is a classical framework for jointly learning from both labeled and unlabeled data, where the key requirement is that the support of the unknown marginal distribution has the geometric structure of a Riemannian manifold. Typically, the Laplace-Beltrami operator-based manifold regularization can be approximated empirically by the Laplacian regularization associated with the entire training data and its corresponding graph Laplacian matrix. However, the graph Laplacian matrix depends heavily on the prespecified similarity metric and may lead to inappropriate penalties when dealing with redundant or noisy input variables. To address these issues, this paper proposes a new \textit{Semi-Supervised Meta Additive Model (S$^2$MAM)} based on a bilevel optimization scheme that automatically identifies informative variables, updates the similarity matrix, and simultaneously achieves interpretable predictions. Theoretical guarantees are provided for S$^2$MAM, including computational convergence and a statistical generalization bound. Experimental assessments across 4 synthetic and 12 real-world datasets, with varying levels and categories of corruption, validate the robustness and interpretability of the proposed approach.
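As a rough illustration of why the graph Laplacian depends on the similarity metric and on which input variables enter it, the sketch below builds an unnormalized graph Laplacian from a Gaussian similarity and evaluates the Laplacian penalty $f^\top L f$ with and without a noisy variable. This is not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the per-variable `mask`, and the function names are all assumptions made for illustration, and the mask is fixed by hand rather than learned.

```python
import numpy as np


def graph_laplacian(X, mask=None, bandwidth=1.0):
    """Unnormalized graph Laplacian L = D - W from a Gaussian similarity.

    `mask` is a hypothetical per-variable weight (0 drops a variable,
    1 keeps it); it stands in for a variable-selection step.
    """
    X = np.asarray(X, dtype=float)
    if mask is not None:
        X = X * np.asarray(mask, dtype=float)  # rescale each input variable
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian similarity matrix
    np.fill_diagonal(W, 0.0)                  # no self-loops
    D = np.diag(W.sum(axis=1))                # degree matrix
    return D - W


def laplacian_penalty(f, L):
    """Laplacian regularizer f^T L f = 0.5 * sum_ij W_ij (f_i - f_j)^2."""
    f = np.asarray(f, dtype=float)
    return float(f @ L @ f)


# Toy data: one informative variable (two clusters) and one pure-noise variable.
rng = np.random.default_rng(0)
informative = np.concatenate([rng.normal(0.0, 0.3, 10), rng.normal(5.0, 0.3, 10)])
noisy = rng.normal(0.0, 3.0, 20)
X = np.column_stack([informative, noisy])
f = np.concatenate([-np.ones(10), np.ones(10)])  # predictions on all points

L_all = graph_laplacian(X)                   # similarity uses both variables
L_sel = graph_laplacian(X, mask=[1.0, 0.0])  # similarity ignores the noisy one
print("penalty, all variables:   ", laplacian_penalty(f, L_all))
print("penalty, informative only:", laplacian_penalty(f, L_sel))
```

The two penalties differ only because the similarity matrix changed, so the same predictor is penalized differently depending on which variables the metric sees. According to the abstract, S$^2$MAM determines the informative variables and the similarity matrix through bilevel optimization instead of relying on a fixed, hand-chosen metric like the one used here.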

Data Curation & Synthetic Data

Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
