Search papers, labs, and topics across Lattice.
The paper addresses the problem of robust mean estimation under mean-shift contamination, where an adversary can shift a fraction of the samples. They establish sample complexity bounds for general base distributions, demonstrating that consistent estimation is possible under mild spectral conditions on the characteristic function. The authors provide both an algorithm and a matching lower bound, utilizing Fourier analysis and introducing the concept of a Fourier witness.
Mean estimation is possible even when an adversary shifts a fraction of your data, provided the base distribution's characteristic function meets certain spectral conditions.
We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from arbitrarily shifted versions of the base distribution. Prior work characterized the sample complexity of this task for the special cases of the Gaussian and Laplace distributions. Specifically, it was shown that consistent estimation is possible in these cases, a property that is provably impossible in Huber's contamination model. An open question posed in earlier work was to determine the sample complexity of mean estimation in the mean-shift contamination model for general base distributions. In this work, we study and essentially resolve this open question. Specifically, we show that, under mild spectral conditions on the characteristic function of the (potentially multivariate) base distribution, there exists a sample-efficient algorithm that estimates the target mean to any desired accuracy. We complement our upper bound with a qualitatively matching sample complexity lower bound. Our techniques make critical use of Fourier analysis, and in particular introduce the notion of a Fourier witness as an essential ingredient of our upper and lower bounds.