Research Drivers and Capabilities
When: Thursday, Oct 22, 2020 – 9AM Pacific Time/4PM UTC
What interfaces and capabilities are needed alongside storage fabric to enable data sharing? This seminar will focus on domain examples from partners using the OSN – from the earth sciences to metro science.
Studies of the whole-earth climate record can require petabytes of data. Fusing, storing, and accessing such a dataset for scientific exploration and analysis raise a number of issues for both on-premises and cloud-based scientific access.
Donald Petravick is a Senior Project Manager at the National Center for Supercomputing Applications. He is the Principal Investigator for the Dark Energy Survey, and a level 3 manager for the CMB-S4 experiment. He also works on Earth Science and Multi-Messenger Astronomy at NCSA.
Geographically distributed sensor systems that include cameras, microphones, and weather and air quality stations can generate such large volumes of data that fast and efficient analysis is best performed by an embedded computer connected directly to the sensor. The NSF-funded Sage project (sagecontinuum.org) will explore new techniques for applying machine learning algorithms to data from such intelligent sensors and then build reusable software that can run programs within the embedded computer and transmit the results over the network to central computer servers. To enable researchers to develop their own AI applications, Sage will also collect "training data" from the sensors at the edge. In this presentation we will explain how the Sage project will rely on the Open Storage Network to transfer, store, and share training data and other user-generated scientific data products (e.g., training labels, models).
Wolfgang Gerlach is a Senior Software Engineer at the University of Chicago with a joint appointment at Argonne National Laboratory. He holds a PhD in bioinformatics and developed his interest in containerized cloud infrastructures and multi-cloud scientific workflows while working for MG-RAST, a web portal for analysis of metagenomic data. Currently, Wolfgang is working on the Sage project, helping to design the cyberinfrastructure of Sage and implementing core software components.
Disasters, whether natural or human-caused, tend to be defined partly by their short duration and by the crisis environment a disaster evokes. During a disaster there is often a need to access relevant data and information quickly. Additionally, ephemeral data of relevance to researchers may be generated, yet be at risk of disappearing or becoming hard to find after the crisis has passed. Even after the crisis, researchers and decision-makers need to access relevant data that is difficult to find and work with, and they often need to integrate disparate data related to a particular event. This use case describes an approach to addressing these challenges by leveraging the capabilities offered by the RENCI-supported OSN node and the Hurricanes Matthew and Harvey National Water Model data for North Carolina. We explore nexus and emerging needs for GeoHealth, Infrastructure, Community Readiness and Scalable Governance methods in Open Source development and research. We provide perspectives on how online training programs and hackweek education models can be a design method for improving our information networks and cyberinfrastructure with broad social and environmental implications.
Christina Bandaragoda Bio:
Christina Bandaragoda joined the University of Washington in 2013. She received her PhD in Civil & Environmental Engineering, Master’s of Business Administration, and Master’s in Biological & Agricultural Engineering from Utah State University, and a BS from Wheaton College. Prior to obtaining her graduate degrees, she worked in the National Park Service and studied International Development with extensive travel in Asia and the Caribbean. She provides hydrologic modeling services to multi-institutional watershed groups, and maintains professional relationships through sponsored projects with agricultural and tribal science communities in the Pacific Northwest.
Chris Lenhardt Bio:
A research scientist working in the Earth Data Science group at the Renaissance Computing Institute (RENCI), UNC-CH, Chris works at the intersection of information science and environmental science and contributes to a range of projects at RENCI. He currently leads an effort to develop a data distribution hub for the North Carolina Per- and Polyfluoroalkyl Substances Testing Network (NCPFAST) and has been collaborating with the co-presenter, Christina Bandaragoda at the University of Washington, on the hurricane water quality response projects she leads. Chris’ current research interests include studying the sociotechnical aspects of science cyberinfrastructure and convergence research. Prior work includes positions as the Manager of the Distributed Active Archive Center (DAAC) for Biogeochemical Dynamics at Oak Ridge National Laboratory and Deputy Manager at the Socioeconomic Data and Applications DAAC (SEDAC) at the Center for International Earth Science Information Network (CIESIN), Columbia University. Both DAACs are NASA-funded archives, part of the Earth Observing System Data and Information System.
A concept paper based on the event will be posted before Thanksgiving 2020.