Feb 25, 2026 · arXiv:2602.22282

Differentially Private Truncation of Unbounded Data via Public Second Moments

AI Summary

This paper introduces Public-moment-guided Truncation (PMT), a novel method for applying differential privacy (DP) to unbounded data by leveraging public second-moment information to truncate the data. PMT transforms private data using a public second-moment matrix, enabling a principled truncation based on data dimension and sample size, which results in a well-conditioned second-moment matrix. The authors demonstrate PMT's effectiveness by designing new loss functions and algorithms for penalized and generalized linear regressions, showing improved DP estimation, robustness, and convergence through theoretical analysis and empirical validation on synthetic and real datasets.
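The conditioning claim can be illustrated with a short derivation. This is a sketch under an assumption the summary does not spell out: that the transformation is whitening by the inverse square root of the public second-moment matrix, written here as $\hat\Sigma_{\mathrm{pub}}$, with $\Sigma$ denoting the true second-moment matrix of the private data. Truncation is ignored in the expectation, and the map-back step is shown only for a plain linear predictor.

```latex
z = \hat\Sigma_{\mathrm{pub}}^{-1/2}\, x,
\qquad
\mathbb{E}\!\left[ z z^\top \right]
  = \hat\Sigma_{\mathrm{pub}}^{-1/2}\, \Sigma\, \hat\Sigma_{\mathrm{pub}}^{-1/2}
  \approx I_d
  \quad \text{when } \hat\Sigma_{\mathrm{pub}} \approx \Sigma .

x^\top \beta
  = z^\top \hat\Sigma_{\mathrm{pub}}^{1/2} \beta
  = z^\top \theta ,
\qquad
\theta = \hat\Sigma_{\mathrm{pub}}^{1/2} \beta
\;\Longleftrightarrow\;
\beta = \hat\Sigma_{\mathrm{pub}}^{-1/2} \theta .
```

Under these assumptions, a coefficient vector $\theta$ fitted in the transformed space corresponds to $\beta$ in the original domain through a fixed, public linear map, so the map back costs no additional privacy budget.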

Key Contribution

Unlock differential privacy for unbounded data: PMT leverages public second moments to truncate data, boosting accuracy and stability in DP models.

Abstract

Data privacy is a central concern in the AI era, and differential privacy (DP) is one of the gold-standard solutions. However, standard DP mechanisms typically require the data to come from a bounded distribution. We address this limitation by leveraging second-moment information from a small amount of public data. We propose Public-moment-guided Truncation (PMT), which transforms private data using the public second-moment matrix and applies a principled truncation whose radius depends only on non-private quantities: the data dimension and the sample size. This transformation yields a well-conditioned second-moment matrix, making its inversion substantially more robust to DP noise. Furthermore, we demonstrate the applicability of PMT to penalized and generalized linear regressions. Specifically, we design new loss functions and algorithms that ensure solutions in the transformed space can be mapped back to the original domain. We establish improvements in the models' DP estimation through theoretical error bounds, robustness guarantees, and convergence results, and attribute the gains to the conditioning effect of PMT. Experiments on synthetic and real datasets confirm that PMT substantially improves the accuracy and stability of DP models.
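The transform-then-truncate step described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `pmt_transform`, the use of whitening by the inverse square root of the public matrix, and especially the radius formula are assumptions made for this example. The abstract states only that the radius depends on the dimension and the sample size.

```python
import numpy as np


def pmt_transform(X, sigma_pub, radius):
    """Whiten the rows of X with a public second-moment matrix, then clip their norms.

    Hypothetical helper: the name and the exact form of the transform are assumptions.
    """
    # Inverse square root via eigendecomposition; sigma_pub is assumed
    # symmetric positive definite.
    evals, evecs = np.linalg.eigh(sigma_pub)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # inv_sqrt is symmetric, so each row of Z is sigma_pub^{-1/2} x_i.
    Z = X @ inv_sqrt
    # Project rows with norm above `radius` back onto the ball of that radius.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    return Z * scale


rng = np.random.default_rng(0)
d, n_pub, n_priv = 5, 200, 5000

# Synthetic unbounded data with very different scales per coordinate.
scales = np.sqrt(np.array([100.0, 10.0, 1.0, 0.1, 0.01]))
X_pub = rng.normal(size=(n_pub, d)) * scales
X_priv = rng.normal(size=(n_priv, d)) * scales

# Second-moment matrix estimated from the small public sample only.
sigma_pub = X_pub.T @ X_pub / n_pub

# Assumed radius: a function of d and n only (illustrative, not the paper's formula).
radius = np.sqrt(d) + np.sqrt(2.0 * np.log(n_priv))

Z = pmt_transform(X_priv, sigma_pub, radius)

cond_before = np.linalg.cond(X_priv.T @ X_priv / n_priv)
cond_after = np.linalg.cond(Z.T @ Z / n_priv)
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.2f}")
```

Because every transformed row has norm at most `radius`, each outer product of a row with itself has Frobenius norm at most `radius ** 2`. That bound is what lets a DP mechanism calibrate noise for the second-moment matrix without inspecting the private data.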

Constitutional AI & AI Ethics · Data Curation & Synthetic Data

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
