Mar 2, 2026 | arXiv:2603.02043

Leave-One-Out Prediction for General Hypothesis Classes

AI Summary

The paper introduces Median of Level-Set Aggregation (MLSA), an aggregation procedure built from empirical-risk level sets around the ERM, as a general framework for leave-one-out (LOO) prediction. It establishes a multiplicative oracle inequality: the LOO error of the aggregated predictor is at most a constant multiple ($C > 1$) of the minimum empirical risk over the class, plus a complexity term, with both terms normalized by the sample size $n$. The analysis relies on a local level-set growth condition, which is verified for VC classes, finite hypothesis and density classes, and logistic regression, yielding complexity terms scaling as $O(d \log n)$, $O(\log |H|)$ and $O(\log |P|)$, and $O(d \log n)$ (up to problem-dependent factors), respectively.

Key Contribution

The paper gives data-dependent leave-one-out guarantees, in the form of a multiplicative oracle inequality, for any hypothesis class that satisfies a local level-set growth condition, using a level-set aggregation procedure (MLSA).

Abstract

Leave-one-out (LOO) prediction provides a principled, data-dependent measure of generalization, yet guarantees in fully transductive settings remain poorly understood beyond specialized models. We introduce Median of Level-Set Aggregation (MLSA), a general aggregation procedure based on empirical-risk level sets around the ERM. For arbitrary fixed datasets and losses satisfying a mild monotonicity condition, we establish a multiplicative oracle inequality for the LOO error of the form \[ LOO_S(\hat{h}) \;\le\; C \cdot \frac{1}{n} \min_{h\in H} L_S(h) \;+\; \frac{Comp(S,H,\ell)}{n}, \qquad C>1. \] The analysis is based on a local level-set growth condition controlling how the set of near-optimal empirical-risk minimizers expands as the tolerance increases. We verify this condition in several canonical settings. For classification with VC classes under the 0-1 loss, the resulting complexity scales as $O(d \log n)$, where $d$ is the VC dimension. For finite hypothesis and density classes under bounded or log loss, it scales as $O(\log |H|)$ and $O(\log |P|)$, respectively. For logistic regression with bounded covariates and parameters, a volumetric argument based on the empirical covariance matrix yields complexity scaling as $O(d \log n)$ up to problem-dependent factors.
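To make the quantities in the abstract concrete, the following is a minimal sketch of level-set aggregation and the resulting LOO error on a toy problem. It is an illustration under stated assumptions, not the paper's construction: the abstract does not say how MLSA chooses its tolerance or combines level sets, so the sketch uses a single fixed tolerance `tol`, a finite class of one-dimensional threshold classifiers, and the 0-1 loss. All function names and the toy data are hypothetical.

```python
from statistics import median_low


def predict(threshold, x):
    """Threshold classifier: label 1 when x is at or above the threshold."""
    return 1 if x >= threshold else 0


def cumulative_loss(threshold, sample):
    """Total 0-1 loss of one hypothesis on a list of (x, y) pairs."""
    return sum(predict(threshold, x) != y for x, y in sample)


def level_set(thresholds, sample, tol):
    """Hypotheses whose cumulative loss is within `tol` of the ERM's loss."""
    losses = [cumulative_loss(t, sample) for t in thresholds]
    best = min(losses)
    return [t for t, loss in zip(thresholds, losses) if loss <= best + tol]


def aggregate_predict(thresholds, sample, x, tol):
    """Median prediction over the level set.

    For binary labels this is a majority vote; median_low breaks ties
    in favor of label 0. (Assumption: the paper's rule may differ.)
    """
    votes = [predict(t, x) for t in level_set(thresholds, sample, tol)]
    return median_low(votes)


def loo_error(thresholds, sample, tol):
    """Cumulative LOO 0-1 loss: refit without point i, then predict it."""
    errors = 0
    for i, (x, y) in enumerate(sample):
        held_out = sample[:i] + sample[i + 1:]
        errors += aggregate_predict(thresholds, held_out, x, tol) != y
    return errors


# Hypothetical toy data: separable by the threshold 0.5.
sample = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
          (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]

print("level set at tol=0:", level_set(thresholds, sample, tol=0))
print("cumulative LOO error at tol=0:", loo_error(thresholds, sample, tol=0))
```

Here `cumulative_loss` plays the role of $L_S(h)$ and `loo_error` returns the unnormalized LOO loss; dividing both by $n$ matches the normalization in the inequality above. Widening `tol` enlarges the level set, which is the growth that the paper's local condition is said to control.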


Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
