PCPs and the Hardness of Generating Synthetic Data

Citation:

Jon Ullman and Salil Vadhan. 2011. “PCPs and the Hardness of Generating Synthetic Data.” In Proceedings of the 8th IACR Theory of Cryptography Conference (TCC `11), edited by Yuval Ishai, Lecture Notes on Computer Science, 5978: Pp. 572–587. Providence, RI: Springer-Verlag. Date Presented: 28–30 March. Springer Link
PDF648 KB

Abstract:

Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm A that takes a database D\in ({0,1}^d)^n and outputs a ``synthetic database'' D' all of whose two-way marginals are approximately equal to those of D. (A two-way marginal is the fraction of database rows x\in {0,1}^d with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS `07), who gave an algorithm running in time poly(n,2^d). Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC `09) with PCP-based Levin-reductions from NP search problems to finding approximate solutions to CSPs.

Notes:

Full version posted as {\em ECCC} TR10-017.

Acknowledgements: This paper was supported, in part, by NSF grant CNS-0831289.
Last updated on 04/13/2019