Bootstrap Inference and Differential Privacy: Standard Errors for Free


Thomas Brawner and James Honaker. 7/2018. “Bootstrap Inference and Differential Privacy: Standard Errors for Free.” Summer Meetings of the Society for Political Methodology, Provo, UT.
PrePrint.pdf0 bytes


The bootstrap is a common and powerful statistical tool for numerically computing the standard error of estimators, that is, a calculation of the uncertainty of functions computed on sample data so as to make an inference back to the original population from which the sample was drawn. Understanding uncertainty, and inferential questions, in the context of private data is an increasingly important task within the literature of differential privacy [7, 20, 15]. We show how to construct an implementation of the bootstrap within differential privacy. Most importantly, we show that, for a broad class of functions under zero concentrated differential privacy, the bootstrap can be implemented at no cost. That is, for a given choice of privacy parameter and associated expected error of some query, the bootstrap can be implemented for the exact same privacy guarantee, resulting in the same expected error (or sometimes less) in the desired query, but additionally provide the standard error of that query. In section 2 we provide a brief overview of differential privacy. Then to describe these results on bootstrap inference, in section 3 we describe some foundational results on the aggregation of repeated queries under contrasting privacy and composition definitions. This leads to a tangential result in section 4 on a low-noise Gaussian mechanism for pure differential privacy. Next we provide a brief foundation on the bootstrap algorithm in statistics in section 5, before showing our algorithmic construction of the bootstrap using the mechanisms of differential privacy in section 6. In section 7 we describe how to use the differentially private estimate of the standard error in the construction of confidence intervals and hypothesis tests, and then demonstrate this in section 8 with examples using published Census microdata in the style of privacy sensitive data.
Last updated on 05/02/2022