Thursday, October 20, 2016
Maxwell Dworkin G115
Abstract: Since modern science began, data have been a critical part of the scientific enterprise, not only for conducting science but also for communicating and validating scientific results. From the beginning, it was clear that for the scientific community to continually verify scientific results, the underlying data had to be made accessible. But that has not been, and is still not, always the case. In recent years, however, public data repositories have grown significantly, making many research data sets easily accessible to others. The Dataverse project, open-source software for building repositories to share research data (such as the Harvard Dataverse), has played an important role in making this happen by giving researchers incentives to share their own data. In this talk, I will discuss how we got here and introduce current projects that extend Dataverse to address the next challenges in sharing research data. In particular, I'll present a project that, by integrating Dataverse with remote computing sites, makes large-scale structural biology data widely accessible and helps validate previous results.
Speaker: Mercè Crosas is the Chief Data Science and Technology Officer at the Institute for Quantitative Social Science (IQSS) at Harvard University. She has more than 10 years of experience leading the Dataverse project and more than 15 years of experience building data management and analysis systems in industry and academia. She is part of numerous committees and working groups focused on research data management, data standards, and research best practices. Crosas is currently co-PI of the Dataverse Project, with IQSS faculty director Gary King, and supervises the Zelig project for statistical analysis, Consilience for text analysis, and the Data Science Services and Data Curation Services at IQSS. She is collaborating with the Harvard Privacy Tools project led by Salil Vadhan (Harvard), the Provenance project with Margo Seltzer (Harvard), the Structural Biology Grid Data project with Piotrek Sliz (Harvard Medical School), and with the Massachusetts Open Cloud (BU), among others.
Crosas holds a Ph.D. in Astrophysics and a B.S. in Physics. More at mercecrosas.com and @mercecrosas.
Host: Margo Seltzer