Feb 26, 2026 · arXiv:2602.23341

Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Alkis Kalavasis, Anay Mehrotra, Manolis Zampetakis, Felix Zhou, Ziyu Zhu

AI Summary

This paper studies the problem of Gaussian mean estimation from coarse data, where instead of observing the exact sample, the learner observes a set containing the sample. The authors provide a characterization of when the mean is identifiable under convex partitions, resolving an open question from Fotakis et al. (2021). Furthermore, they demonstrate that computationally efficient estimation is possible under identifiability and convex partitions, addressing another open question from the same prior work.

Key Contribution

Characterizes exactly when a Gaussian mean is identifiable from coarse data under convex partitions, and shows that computationally efficient estimation is possible whenever the mean is identifiable and the partition is convex.

Abstract

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs naturally through measurement rounding, sensor limitations, and lag in economic systems. We study Gaussian mean estimation from coarse data, where each true sample $x$ is drawn from a $d$-dimensional Gaussian distribution with identity covariance, but is revealed only through the set of a partition containing $x$. When the coarse samples, roughly speaking, have ``low'' information, the mean cannot be uniquely recovered from observed samples (i.e., the problem is not identifiable). Recent work by Fotakis, Kalavasis, Kontonis, and Tzamos [FKKT21] established that sample-efficient mean estimation is possible when the unknown mean is identifiable and the partition consists of only convex sets. Moreover, they showed that without convexity, mean estimation becomes NP-hard. However, two fundamental questions remained open: (1) When is the mean identifiable under convex partitions? (2) Is computationally efficient estimation possible under identifiability and convex partitions? This work resolves both questions. [...]
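To make the observation model concrete, here is a minimal Python sketch of coarse sampling. It is an illustration only, not the paper's algorithm. The function names (`coarsen_to_cells`, `coarsen_to_slabs`, `sample_coarse`) and the two partitions are assumptions chosen for the example: axis-aligned unit cubes model measurement rounding, and slabs orthogonal to the first axis model a "low information" convex partition. With slabs, the observed set depends only on the first coordinate of $x$, so two means that agree in that coordinate produce the same distribution over observed sets.

```python
import numpy as np

def coarsen_to_cells(x, width=1.0):
    """Index of the axis-aligned cube (a convex set) containing each point."""
    return np.floor(x / width).astype(int)

def coarsen_to_slabs(x, width=1.0):
    """Index of the slab {x : k*width <= x_1 < (k+1)*width} containing each point."""
    return np.floor(x[:, 0] / width).astype(int)

def sample_coarse(mu, n, coarsen, seed=0):
    """Draw n samples from N(mu, I) and reveal only the partition set of each."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, len(mu)))  # identity-covariance noise
    x = mu + z                             # true samples, never shown to the learner
    return coarsen(x)

# Two hypothetical means that differ only in the second coordinate.
mu_a = np.array([0.3, -1.0])
mu_b = np.array([0.3, 2.0])

# Slabs ignore the second coordinate, so these coarse samples cannot
# distinguish mu_a from mu_b (same seed, hence the same underlying noise).
slabs_a = sample_coarse(mu_a, 5, coarsen_to_slabs)
slabs_b = sample_coarse(mu_b, 5, coarsen_to_slabs)

# Cubes retain information about every coordinate.
cells_a = sample_coarse(mu_a, 5, coarsen_to_cells)
cells_b = sample_coarse(mu_b, 5, coarsen_to_cells)
```

Reusing the seed fixes the underlying noise, which makes the comparison exact: the slab observations coincide for both means, while the cube observations differ.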

Data Curation & Synthetic Data · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 54

Year: 2026

Venue: N/A
