Bridging Privacy Definitions

This working group - composed of privacy experts across disciplines - explores the range of privacy-related definitions from law, computer science, and social science, covering topics such as measures of informational harm, de-identification techniques, formal privacy models such as differential privacy, and privacy standards from laws such as FERPA and HIPAA. The group explores the nature of these definitions, the relationships and gaps between them, and potential methods of bridging the disciplinary divide.

A recent product from this working group is a methodology for extracting a mathematical model from a legal standard such as FERPA. This product can be used to demonstrate that a privacy technology satisfies any given legal standard.

For the 2016-2017 year, we plan to focus on questions related to the broad conceptualization of informational harms, including group harms like discrimination and their relationship to the types of harms addressed by formal privacy definitions like differential privacy. We are also looking to develop methods for setting formal privacy parameters (like the differential privacy parameter epsilon) based on accepted legal, ethical, and social notions.

We are excited to hear from anyone seeking to explore multidisciplinary approaches to privacy. For more information and to join our mailing list, please contact Gabriella Fee at



Salil Vadhan

Salil Vadhan

Vicky Joseph Professor of Computer Science and Applied Mathematics, SEAS, Harvard

Salil Vadhan is the lead PI of Privacy Tools for Sharing Research Data project and the Vicky Joseph Professor of Computer Science and Applied Mathematics.

Kobbi Nissim

Kobbi Nissim

Senior Research Fellow at Harvard University
Professor of Computer Science at Georgetown University
Micah Altman

Micah Altman

Director of Research and Head/Scientist, Program on Information Science for the MIT Libraries, MIT
Non-Resident Senior Fellow, The Brookings Institution


Micah Altman, Alexandra Wood, David R. O'Brien, and Urs Gasser. 2016. “Practical Approaches to Big Data Privacy Over Time.” Brussels Privacy Symposium, 2016. Publisher's VersionAbstract

Increasingly, governments and businesses are collecting, analyzing, and sharing detailed information about individuals over long periods of time. Vast quantities of data from new sources and novel methods for large-scale data analysis promise to yield deeper understanding of human characteristics, behavior, and relationships and advance the state of science, public policy, and innovation. At the same time, the collection and use of fine-grained personal data over time is associated with significant risks to individuals, groups, and society at large. In this article, we examine a range of longterm data collections, conducted by researchers in social science, in order to identify the characteristics of these programs that drive their unique sets of risks and benefits. We also examine the practices that have been established by social scientists to protect the privacy of data subjects in light of the challenges presented in long-term studies. We argue that many uses of big data, across academic, government, and industry settings, have characteristics similar to those of traditional long-term research studies. In this article, we discuss the lessons that can be learned from longstanding data management practices in research and potentially applied in the context of newly emerging data sources and uses.

K. Nissim, A Bembenek, A Wood, M Bun, M Gaboardi, U. Gasser, D O'Brien, T Steinke, and S. Vadhan. 2016. “Bridging the Gap between Computer Science and Legal Approaches to Privacy.” In Privacy Law Scholars Conference. Privacy Law Scholars Conference, Washington D.C., 2016.Abstract
The fields of law and computer science incorporate contrasting notions of the privacy risks associated with the analysis and release of statistical data about individuals and groups of individuals. Emerging concepts from the theoretical computer science literature provide formal mathematical models for quantifying and mitigating privacy risks, where the set of risks they take into account is much broader than the privacy risks contemplated by many privacy laws. An example of such a model is differential privacy, which provides a provable guarantee of privacy against a wide range of potential attacks, including types of attacks currently unknown or unforeseen. The subject of much theoretical investigation, new privacy technologies based on formal models have recently been making significant strides towards practical implementation. For these tools to be used with sensitive personal information, it is important to demonstrate that they satisfy relevant legal requirements for privacy protection. However, making such an argument is challenging due to the conceptual gaps between the legal and technical approaches to defining privacy. Notably, information privacy laws are generally subject to interpretation and some degree of flexibility, which creates uncertainty for the implementation of more formal approaches. This Article articulates the gaps between legal and technical approaches to privacy and presents a methodology for rigorously arguing that a technological method for privacy protection satisfies the requirements of a particular law. The proposed methodology has two main components: (i) extraction of a formal mathematical requirement of privacy based on a legal standard found in an information privacy law, and (ii) construction of a rigorous mathematical proof for establishing that a technological privacy solution satisfies the mathematical requirement derived from the law. To handle ambiguities that can lead to different interpretations of a legal standard, the methodology takes a conservative “worst-case” approach and attempts to extract a mathematical requirement that is robust to potential ambiguities. Under this approach, the mathematical proof demonstrates that a technological method satisfies a broad range of reasonable interpretations of a legal standard. The Article demonstrates the application of the proposed methodology with an example bridging between the requirements of the Family Educational Rights and Privacy Act of 1974 and differential privacy.
Effy Vayena, Urs Gasser, Alexandra Wood, David R. O'Brien, and Micah Altman. 2016. “Elements of a New Ethical Framework for Big Data Research.” Washington and Lee Law Review, 3, 72, 31 Mar, 2016. PDFAbstract

merging large-scale data sources hold tremendous potential for new scientific research into human biology, behaviors, and relationships. At the same time, big data research presents privacy and ethical challenges that the current regulatory framework is ill-suited to address. In light of the immense value of large-scale research data, the central question moving forward is not whether such data should be made available for research, but rather how the benefits can be captured in a way that respects fundamental principles of ethics and privacy.

In response, this Essay outlines elements of a new ethical framework for big data research. It argues that oversight should aim to provide universal coverage of human subjects research, regardless of funding source, across all stages of the information lifecycle. New definitions and standards should be developed based on a modern understanding of privacy science and the expectations of research subjects. In addition, researchers and review boards should be encouraged to incorporate systematic risk-benefit assessments and new procedural and technological solutions from the wide range of interventions that are available. Finally, oversight mechanisms and the safeguards implemented should be tailored to the intended uses, benefits, threats, harms, and vulnerabilities associated with a specific research activity.

Development of a new ethical framework with these elements should be the product of a dynamic multistakeholder process that is designed to capture the latest scientific understanding of privacy, analytical methods, available safeguards, community and social norms, and best practices for research ethics as they evolve over time. Such a framework would support big data utilization and help harness the value of big data in a sustainable and trust-building manner.

Alexandra Wood, Edo Airoldi, Micah Altman, Yves-Alexandre de Montjoye, Urs Gasser, David O'Brien, and Salil Vadhan. 2016. “Comments on the Proposed Rules to Revise the Federal Policy for the Protection of Human Subjects”. Online VersionAbstract

Alexandra Wood, Edo Airoldi, Micah Altman, Yves-Alexandre de Montjoye, Urs Gasser, David O'Brien, and Salil Vadhan submitted comments in response to the September 2015 notice of proposed rulemaking to revise the Federal Policy for the Protection of Human Subjects. With the ability to collect and analyze massive quantities of data related to human characteristics, behaviors, and interactions, researchers are increasingly able to explore phenomena in finer detail and with greater confidence. A major challenge for realizing the full potential of these recent advances will be protecting the privacy of human subjects. Drawing from their research findings and a forthcoming article articulating a modern approach to privacy analysis, the authors offer recommendations for updating the Common Rule to reflect recent developments in the scientific understanding of privacy. The suggested revisions ultimately aim to enable wider collection, use, and sharing of research data while providing stronger privacy protection for human subjects.


Specific recommendations include:


  • Incorporating clear and consistent definitions for privacy, confidentiality, and security.

  • Providing similar levels of protection to research activities that pose similar risks.

  • Relying on standards and requirements that recognize the limitations of traditional de-identification techniques, the inadequacy of binary conceptions of “identifiable” and “publicly-available” information, and the significance of inference risks to privacy.

  • Creating a new privacy standard based not on a binary identifiability standard, but on the extent to which attributes that may be revealed or inferred depend on an individual’s data and the potential harm that may result.

  • Requiring investigators to conduct systematic privacy analyses and calibrate their use of privacy and security controls to the specific intended uses and privacy risks at every stage of the information lifecycle.

  • Addressing informational risks using a combination of privacy and security controls rather than relying on a single control such as consent or de-­identification and adopting tiered access models where appropriate.

  • Forming an advisory committee of data privacy experts to help the Secretary of Health and Human Services develop guidance on applying privacy and security controls that are closely matched to the intended uses and privacy risks in specific research activities.


The authors argue that addressing these issues will help lead researchers towards state-of-the-art privacy practices and advance the exciting research opportunities enabled by new data sources and technologies for collecting, analyzing, and sharing data about individuals.

The full comments are also available through
Micah Altman, Alexandra Wood, David O'Brien, Salil Vadhan, and Urs Gasser. 2016. “Towards a Modern Approach to Privacy-Aware Government Data Releases.” Berkeley Technology Law Journal, 3, 30. BTLJ VersionAbstract

Transparency is a fundamental principle of democratic governance. Making government data more widely available promises to enhance organizational transparency, improve government functions, encourage civic engagement, support the evaluation of government decisions, and ensure accountability for public institutions. Furthermore, releases of government data promote growth in the private sector, guiding investment and other commercial decisions, supporting innovation in the technology sectors, and promoting economic development and competition generally. Improving access to government data also advances the state of research and scientific knowledge, changing how researchers approach their fields of study and enabling them to ask new questions and gain better insights into human behaviors. For instance, the increased availability of large-scale datasets is advancing developments in computational social science, a field that is rapidly changing the study of humans, human behavior, and human institutions, and effectively shifting the evidence base of social science. Scientists are also developing methods to mine and model new data sources and big data, and data collected from people and institutions have proven useful in unexpected ways. In the area of public health, Google Flu Trends, which provides a useful and timely supplement to conventional flu tracking methods by analyzing routine Google queries, is a widely publicized example of the unexpected uses of data. These are, of course, just a few examples of the many benefits of open data.

  • 1 of 2
  • »