# Computing over Distributed Sensitive Data: Publications

Victor Balcer and Albert Cheu. 4/2020. “Separating Local & Shuffled Differential Privacy via Histograms.” arXiv. Publisher's VersionAbstract
Recent work in differential privacy has highlighted the shuffled model as a promising avenue to compute accurate statistics while keeping raw data in users' hands. We present a protocol in this model that estimates histograms with error independent of the domain size. This implies an arbitrarily large gap in sample complexity between the shuffled and local models. On the other hand, the models are equivalent when we impose the constraints of pure differential privacy and single-message randomizers.
Victor Balcer, Albert Cheu, Matthew Joseph, and Jieming Mao. 8/2020. “Connecting Robust Shuffle Privacy and Pan-Privacy.” arXiv. Publisher's VersionAbstract
In the \emph{shuffle model} of differential privacy, data-holding users send randomized messages to a secure shuffler, the shuffler permutes the messages, and the resulting collection of messages must be differentially private with regard to user data. In the \emph{pan-private} model, an algorithm processes a stream of data while maintaining an internal state that is differentially private with regard to the stream data. We give evidence connecting these two apparently different models.
Our results focus on \emph{robustly} shuffle private protocols, whose privacy guarantees are not greatly affected by malicious users. First, we give robustly shuffle private protocols and upper bounds for counting distinct elements and uniformity testing. Second, we use pan-private lower bounds to prove robustly shuffle private lower bounds for both problems. Focusing on the dependence on the domain size k, we find that robust approximate shuffle privacy and approximate pan-privacy have additive error Θ(k√) for counting distinct elements. For uniformity testing, we give a robust approximate shuffle private protocol with sample complexity Õ (k2/3) and show that an Ω(k2/3) dependence is necessary for any robust pure shuffle private tester. Finally, we show that this connection is useful in both directions: we give a pan-private adaptation of recent work on shuffle private histograms and use it to recover further separations between pan-privacy and interactive local privacy.
Owen Arden, Anitha Gollamudi, Ethan Cecchetti, Stephen Chong, and Andrew C. Myers. 2020. “A Calculus for Flow-Limited Authorization.” Journal of Computer Security.Abstract

Real-world applications routinely make authorization decisions based on dynamic computation. Reasoning about dynamically computed authority is challenging. Integrity of the system might be compromised if attackers can improperly influence the authorizing computation. Confidentiality can also be compromised by authorization, since authorization decisions are often based on sensitive data such as membership lists and passwords. Previous formal models for authorization do not fully address the security implications of permitting trust relationships to change, which limits their ability to reason about authority that derives from dynamic computation. Our goal is a way to construct dynamic authorization mechanisms that do not violate confidentiality or integrity.

We introduce the Flow-Limited Authorization Calculus (FLAC), which is both a simple, expressive model for reasoning about dynamic authorization and also an information flow control language for securely implementing various authorization mechanisms. FLAC combines the insights of two previous models: it extends the Dependency Core Calculus with features made possible by the Flow-Limited Authorization Model. FLAC provides strong end-to-end information security guarantees even for programs that incorporate and implement rich dynamic authorization mechanisms. These guarantees include noninterference and robust declassification, which prevent attackers from influencing information disclosures in unauthorized ways. We prove these security properties formally for all FLAC programs and explore the expressiveness of FLAC with several examples.

Amos Beimel, Shay Moran, Kobbi Nissim, and Uri Stemmer. 2/2019. “Private Center Points and Learning of Halfspaces.” In 32nd Annual Conference on Learning Theory, vol 99: Pp. 1–14. Publisher's VersionAbstract
We present a private learner for halfspaces over an arbitrary finite domain X⊂ℝd with sample complexity mathrmpoly(d,2log∗|X|). The building block for this learner is a differentially private algorithm for locating an approximate center point of m>poly(d,2log∗|X|) points -- a high dimensional generalization of the median function. Our construction establishes a relationship between these two problems that is reminiscent of the relation between the median and learning one-dimensional thresholds [Bun et al.\ FOCS '15]. This relationship suggests that the problem of privately locating a center point may have further applications in the design of differentially private algorithms.
We also provide a lower bound on the sample complexity for privately finding a point in the convex hull. For approximate differential privacy, we show a lower bound of m=Ω(d+log|X|), whereas for pure differential privacy m=Ω(d log|X|).
Benny Applebaum, Amos Beimel, Oriol Farr`as, Oded Nir, and Naty Peter. 5/2019. “Secret-Sharing Schemes for General and Uniform Access Structures.” In Annual International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT 2019). Springer VersionAbstract
A secret-sharing scheme allows some authorized sets of parties to reconstruct a secret; the collection of authorized sets is called the access structure. For over 30 years, it was known that any (monotone) collection of authorized sets can be realized by a secret-sharing scheme whose shares are of size2n−o(n)and until recently no better scheme was known. In a recent breakthrough, Liu and Vaikuntanathan (STOC 2018) have reduced the share size to O(20.994n) Our first contribution is improving the exponent of secret sharing down to 0.892. For the special case of linear secret-sharing schemes, we get an exponent of 0.942 (compared to 0.999 of Liu and Vaikuntanathan).Motivated by the construction of Liu and Vaikuntanathan, we study secret-sharing schemes for uniform access structures. An access structure is k-uniform if all sets of size larger than k are authorized, all sets of size smaller than k are unauthorized, and each set of size k can be either authorized or unauthorized. The construction of Liu and Vaikuntanathan starts from protocols for conditional disclosure of secrets, constructs secret-sharing schemes for uniform access structures from them, and combines these schemes in order to obtain secret-sharing schemes for general access structures. Our second contribution in this paper is constructions of secret-sharing schemes for uniform access structures. We achieve the following results:A secret-sharing scheme for k-uniform access structures for large secrets in which the share size is O(k2) times the size of the secret.A linear secret-sharing scheme for k-uniform access structures for a binary secret in which the share size is~O(2h(k/n)n/2) (where h is the binary entropy function). By counting arguments, this construction is optimal (up to polynomial factors).A secret-sharing scheme for k-uniform access structures for a binary secret in which the share size is 2~O(√klogn). Our third contribution is a construction of ad-hoc PSM protocols, i.e., PSM protocols in which only a subset of the parties will compute a function on their inputs. This result is based on ideas we used in the construction of secret-sharing schemes for k-uniform access structures for a binary secret.
Amos Beimel, Aleksandra Korolova, Kobbi Nissim, Or Sheffet, and Uri Stemmer. 6/2020. “The Power of Synergy in Differential Privacy: Combining a Small Curator with Local Randomizers.” In Information-Theoretic Cryptography (To appear - ITC 2020). ArXiv VersionAbstract

Motivated by the desire to bridge the utility gap between local and trusted curator modelsof differential privacy for practical applications, we initiate the theoretical study of a hybridmodel introduced by “Blender” [Avent et al., USENIX Security ’17], in which differentially private protocols of n agents that work in the local-model are assisted by a differentially private curator that has access to the data of m additional users. We focus on the regime where mn and study the new capabilities of this (m;n)-hybrid model. We show that, despite the fact that the hybrid model adds no significant new capabilities for the basic task of simple hypothesistesting, there are many other tasks (under a wide range of parameters) that can be solved in the hybrid model yet cannot be solved either by the curator or by the local-users separately. Moreover, we exhibit additional tasks where at least one round of interaction between the curator and the local-users is necessary – namely, no hybrid model protocol without such interaction can solve these tasks. Taken together, our results show that the combination of the local model with a small curator can become part of a promising toolkit for designing and implementing differential privacy.

Victor Balcer and Albert Cheu. 4/2020. “Separating Local & Shuffled Differential Privacy via Histograms.” In Information-Theoretic Cryptography (To appear - ITC 2020). ArXiv VersionAbstract

Recent work in differential privacy has highlighted the shuffled model as a promising avenue to compute accurate statistics while keeping raw data in users’ hands. We present a protocol in this model that estimates histograms with error independent of the domain size. This impliesan arbitrarily large gap in sample complexity between the shuffled and local models. On theother hand, we show that the models are equivalent when we impose the constraints of pure differential privacy and single-message randomizers.

Borja Balle, James Bell, Adria Gascon, and Kobbi Nissim. 6/2/2019. “The Privacy Blanket of the Shuffle Model.” In International Cryptology Conference (CRYPTO 2019). Publisher's VersionAbstract
This work studies differential privacy in the context of the recently proposed shuffle model. Unlike in the local model, where the server collecting privatized data from users can track back an input to a specific user, in the shuffle model users submit their privatized inputs to a server anonymously. This setup yields a trust model which sits in between the classical curator and local models for differential privacy. The shuffle model is the core idea in the Encode, Shuffle, Analyze (ESA) model introduced by Bittau et al. (SOPS 2017). Recent work by Cheu et al. (EUROCRYPT 2019) analyzes the differential privacy properties of the shuffle model and shows that in some cases shuffled protocols provide strictly better accuracy than local protocols. Additionally, Erlingsson et al. (SODA 2019) provide a privacy amplification bound quantifying the level of curator differential privacy achieved by the shuffle model in terms of the local differential privacy of the randomizer used by each user. In this context, we make three contributions. First, we provide an optimal single message protocol for summation of real numbers in the shuffle model. Our protocol is very simple and has better accuracy and communication than the protocols for this same problem proposed by Cheu et al. Optimality of this protocol follows from our second contribution, a new lower bound for the accuracy of private protocols for summation of real numbers in the shuffle model. The third contribution is a new amplification bound for analyzing the privacy of protocols in the shuffle model in terms of the privacy provided by the corresponding local randomizer. Our amplification bound generalizes the results by Erlingsson et al. to a wider range of parameters, and provides a whole family of methods to analyze privacy amplification in the shuffle model.
Gian Pietro Farina, Stephen Chong, and Marco Gaboardi. 10/2019. “Relational Symbolic Execution.” In 21st International Symposium on Principles and Practice of Declarative Programming (PPDP 2019). Publisher's VersionAbstract

Symbolic execution is a classical program analysis technique used to show that programs satisfy or violate given specifications. In this work we generalize symbolic execution to support program analysis for relational specifications in the form of relational properties - these are properties about two runs of two programs on related inputs, or about two executions of a single program on related inputs. Relational properties are useful to formalize notions in security and privacy, and to reason about program optimizations. We design a relational symbolic execution engine, named RelSymwhich supports interactive refutation, as well as proving of relational properties for programs written in a language with arrays and for-like loops.

Benny Applebaum, Amos Beimel, Oded Nir, and Naty Peter. 6/2020. “Better Secret-Sharing via Robust Conditional Disclosure of Secrets.” In 52nd ACM Symposium on Theory of Computing (To appear - STOC 2020). ePrint VersionAbstract

A secret-sharing scheme allows to distribute a secret s among n parties such that only some predefined

“authorized” sets of parties can reconstruct the secret, and all other “unauthorized” sets learn

nothing about s. The collection of authorized sets is called the access structure. For over 30 years, it

was known that any (monotone) collection of authorized sets can be realized by a secret-sharing scheme

whose shares are of size 2n-o(n)and until recently no better scheme was known. In a recent breakthrough,

Liu and Vaikuntanathan (STOC 2018) have reduced the share size to 20:994n+o(n), which was

later improved to 20:892n+o(n)by Applebaum et al. (EUROCRYPT 2019).

In this paper we improve the exponent of general secret-sharing down to 0:637. For the special case

of linear secret-sharing schemes, we get an exponent of 0:762 (compared to 0:942 of Applebaum et al.).

As our main building block, we introduce a new robust variant of conditional disclosure of secrets

(robust CDS) that achieves unconditional security even under limited form of re-usability. We show that

the problem of general secret-sharing reduces to robust CDS with sub-exponential overhead and derive

our main result by implementing robust CDS with a non-trivial exponent. The latter construction follows

by presenting a general immunization procedure that turns standard CDS into a robust CDS.

Amos Beimel, Kobbi Nissim, and Mohammad Zaherix. 10/2019. “Exploring Differential Obliviousness.” In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 10/2019. Publisher's VersionAbstract

In a recent paper, Chan et al. [SODA ’19] proposed a relaxation of the notion of (full) memory obliviousness, which was introduced by Goldreich and Ostrovsky [J. ACM ’96] and extensively researched by cryptographers. The new notion, differential obliviousness, requires that any two neighboring inputs exhibit similar memory access patterns, where the similarity requirement is that of differential privacy.

Chan et al. demonstrated that differential obliviousness allows achieving improved efficiency for several algorithmic tasks, including sorting, merging of sorted lists, and range query data structures.

In this work, we continue the exploration of differential obliviousness, focusing on algorithms that do not necessarily examine all their input. This choice is motivated by the fact that the existence of logarithmic overhead ORAM protocols implies that differential obliviousness can yield at most a logarithmic improvement in efficiency for computations that need to examine all their input. In particular, we explore property testing, where we show that differential obliviousness yields an almost linear improvement in overhead in the dense graph model, and at most quadratic improvement in the bounded degree model.

We also explore tasks where a non-oblivious algorithm would need to explore different portions of the input, where the latter would depend on the input itself, and where we show that such a behavior can be maintained under differential obliviousness, but not under full obliviousness. Our examples suggest that there would be benefits in further exploring which class of computational tasks are amenable to differential obliviousness.

Anitha Gollamudi, Owen Arden, and Stephen Chong. 6/2019. “Information Flow Control for Distributed Trusted Execution Environments.” In Computer Security Foundations. Publisher's VersionAbstract
Distributed applications cannot assume that their security policies will be enforced on untrusted hosts. Trusted execution environments (TEEs) combined with cryptographic mechanisms enable execution of known code on an untrusted host and the exchange of confidential and authenticated messages with it. TEEs do not, however, establish the trustworthiness of code executing in a TEE. Thus, developing secure applications using TEEs requires specialized expertise and careful auditing. This paper presents DFLATE, a core security calculus for distributed applications with TEEs. DFLATE offers high-level abstractions that reflect both the guarantees and limitations of the underlying security mechanisms they are based on. The accuracy of these abstractions is exhibited by asymmetry between confidentiality and integrity in our formal results: DFLATE enforces a strong form of noninterference for confidentiality, but only a weak form for integrity. This reflects the asymmetry of the security guarantees of a TEE: a malicious host cannot access secrets in the TEE or modify its contents, but they can suppress or manipulate the sequence of its inputs and outputs. Therefore DFLATE cannot protect against the suppression of high-integrity messages, but when these messages are delivered, their contents cannot have been influenced by an attacker.
Victor Balcer and Salil Vadhan. 9/2019. “Differential Privacy on Finite Computers.” Journal of Privacy and Confidentiality, 9, 2. JPC PageAbstract

Version History:

Also presented at TPDP 2017; preliminary version posted as arXiv:1709.05396 [cs.DS].

2018: Published in Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), pp 43:1-43:21. http://drops.dagstuhl.de/opus/frontdoor.php?source_opus=8353

We consider the problem of designing and analyzing differentially private algorithms that can be implemented on discrete models of computation in strict polynomial time, motivated by known attacks on floating point implementations of real-arithmetic differentially private algorithms (Mironov, CCS 2012) and the potential for timing attacks on expected polynomial-time algorithms. As a case study, we examine the basic problem of approximating the histogram of a categorical dataset over a possibly large data universe $$X$$. The classic Laplace Mechanism (Dwork, McSherry, Nissim, Smith, TCC 2006 and J. Privacy & Condentiality 2017) does not satisfy our requirements, as it is based on real arithmetic, and natural discrete analogues, such as the Geometric Mechanism (Ghosh, Roughgarden, Sundarajan, STOC 2009 and SICOMP 2012), take time at least linear in $$|X|$$, which can be exponential in the bit length of the input.

In this paper, we provide strict polynomial-time discrete algorithms for approximate histograms whose simultaneous accuracy (the maximum error over all bins) matches that of the Laplace Mechanism up to constant factors, while retaining the same (pure) differential privacy guarantee.  One of our algorithms produces a sparse histogram as output. Its "per-bin accuracy" (the error on individual bins) is worse than that of the Laplace Mechanism by a factor of $$\log |X|$$, but we prove a lower bound showing that this is necessary for any algorithm that produces a sparse histogram.  A second algorithm avoids this lower bound, and matches the per-bin accuracy of the Laplace Mechanism, by producing a compact and eciently computable representation of a dense histogram; it is based on an $$(n + 1)$$ - wise independent implementation of an appropriately clamped version of the Discrete Geometric Mechanism.