Increasing amounts of scientific data emerging from research projects on all scales are spurring research universities to consider solutions for multi-petabyte (PB) storage. At the same time, more than 200 US academic institutions have access to high-speed network connectivity for research purposes through NSF CC*NIE awards, and to advanced computing resources through XSEDE. Yet the research storage landscape remains highly balkanized, calling for a new type of cyberinfrastructure geared towards facilitating data sharing and transfer.
The Open Storage Network (OSN), funded by NSF and the Schmidt Futures Foundation in 2018, is developing a storage substrate prototype that benefits from existing investments and supports a range of scientific and scholarly production. The OSN will provide a cyberinfrastructure (CI) service to address specific data storage, transfer, sharing, and access challenges. The OSN will enable and enhance data-driven research collaborations across universities, and make new datasets widely available, including where sharing is currently confined to “sneakernet” approaches.
OSN storage nodes, called pods, are robust and secure petascale appliances designed for simple management and effective service across a wide range of data storage and transfer needs. Pods will include a common software stack and will run detailed benchmarks aimed at understanding and improving services for our varied use cases. We draw on existing technologies, including various authentication services (e.g., Globus, iRODS), and the XSEDE allocation process.
The OSN is intended to be an integral part of the national data ecosystem. As such, we seek input into our cyberinfrastructure and policy design, and feedback as we put services in place. Notably, the OSN is linked to the Big Data Innovation Hubs and other data science initiatives involved with a broad range of local, regional, and national-scale research and education.