Large amounts of data are being collected about individuals by a variety of organizations: government agencies, banks, hospitals, research institutions, private companies, etc. Many of these organizations collect similar data, or data about similar populations. Sharing this data between organizations could bring about many benefits in social, scientific, business, and security domains. For example, by sharing their data, hospitals and small clinics can obtain statistically significant results in cases where the individual datasets are otherwise too small. Unfortunately, much of the collected data is sensitive: it contains personal details about individuals or information that may damage an organization’s reputation and competitiveness. The sharing of data is hence often curbed for ethical, legal, or business reasons.
This project develops a collection of tools that will enable the benefits of data sharing without requiring data owners to share their data. The techniques developed respect principles of data ownership and privacy requirements, and draw on recent scientific developments in privacy, cryptography, machine learning, computational statistics, program verification, and system security. The tools developed in this project will contribute to existing research and business infrastructure, and hence enable new ways to create value from information whose use would otherwise have been restricted. The project supports the development of new curricular material and trains a new generation of researchers and citizens with the multidisciplinary perspectives required to address the complex issues surrounding data privacy.
This project is funded by grant 1565387 from the National Science Foundation to Harvard University and SUNY at Buffalo.