The Open Storage Network (OSN) supports science and scholarly research that requires data storage and transfer at scale by simplifying and accelerating access to data that is in active use by ongoing research projects. The OSN places particular emphasis on large data sets (hundreds of terabytes), which are often difficult to share, and long-tail data sets, which are often difficult to find and access. Deployment of the OSN is a response to the increasing importance of storage as the third component of national cyberinfrastructure, complementing investments in computing and networks. While other uses may emerge over time, the OSN is intended initially to serve two principal needs: (1) to facilitate the smooth flow of large data sets between data and computing resources such as instruments, synthetic data projects, campus data centers, national supercomputing centers, and cloud providers; and (2) to make it easy to expose long-tail data sets to the entire scientific community.
The OSN is a functionally and administratively coherent federation of storage systems, referred to as Pods, that reside at independent sites. The OSN design leverages well-defined standards and APIs that accommodate local variation while ensuring uniform global behavior. This approach is intended to enable scaling to hundreds of Pods with an aggregate raw capacity of hundreds of petabytes.
Active Use: Data in Active Use supports new discovery and understanding within the science and education community. Data in Active Use is not archival, and it is not changing in real time (though snapshots from a live data stream would constitute an active data set). Over time, the OSN may quantify parts of this definition with metrics such as the number of downloads or citations.
Pod: The appliance (i.e., equipment and software) that provides OSN functionality, such as storage and connectivity, at a single site.
Project: The community of researchers, curators, and administrators who govern, manage, access, and use an OSN Allocation or Data Set.
Allocation: A quantity of storage that is managed, accessed, and used by a Project. An Allocation may reside on a single Pod or be distributed across several Pods.
Data Set: A collection of one or more objects stored on the OSN that is managed, accessed, and used by a Project. A Data Set may be housed within an Allocation, or it may reside for a short time in storage that is used to facilitate the smooth flow of large data sets.
Users: An OSN user is a group or individual who retrieves data from the OSN. OSN Users view the OSN as a utility that serves their research and education needs without requiring that they know or understand the implementation details.
Data Managers: A Data Manager serves as the point of contact between a Project and the OSN. A Data Manager is responsible for the content that the Project stores within the OSN.
Service Partners: Service Partners provide software and services that support access, identity management, provisioning, and other Pod- and Federation-level services. The OSN is open to any Service Partner that can make use of its API.
Infrastructure Administrators and Oversight Teams: An Infrastructure Administrator is responsible for managing OSN Pods and related infrastructure at one or more sites. Oversight Teams are responsible for ensuring robust operation of the federated OSN Pods.
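As an illustration only, the relationships among the Pod, Project, Allocation, and Data Set terms defined above can be sketched as a small data model. This is not the OSN's actual schema or API: every class, field, and name below (for example `site`, `tb_by_pod`, `object_keys`) is a hypothetical choice made for this sketch. It encodes only what the definitions state: an Allocation belongs to a Project and spans one or more Pods, and a Data Set holds one or more objects and is either housed within an Allocation or held in short-term transfer storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Pod:
    """Storage appliance at a single site (fields are hypothetical)."""
    site: str
    raw_capacity_tb: float


@dataclass
class Allocation:
    """A quantity of storage used by a Project, on one or more Pods."""
    project: str
    # Hypothetical layout: Pod site name -> terabytes placed on that Pod.
    tb_by_pod: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The definition requires at least one Pod.
        if not self.tb_by_pod:
            raise ValueError("an Allocation must span at least one Pod")

    @property
    def total_tb(self) -> float:
        """Total size of the Allocation across all of its Pods."""
        return sum(self.tb_by_pod.values())


@dataclass
class DataSet:
    """A collection of one or more objects managed by a Project."""
    project: str
    object_keys: List[str]
    # None models a Data Set held briefly in transfer storage
    # rather than within an Allocation (an assumption of this sketch).
    allocation: Optional[Allocation] = None

    def __post_init__(self) -> None:
        # The definition requires at least one object.
        if not self.object_keys:
            raise ValueError("a Data Set must contain at least one object")


# Example: one Project with an Allocation spread over two Pods.
pods = [Pod(site="site-a", raw_capacity_tb=1000.0),
        Pod(site="site-b", raw_capacity_tb=1000.0)]
alloc = Allocation(project="demo-project",
                   tb_by_pod={"site-a": 100.0, "site-b": 50.0})
housed = DataSet(project="demo-project",
                 object_keys=["survey/part-0001.dat"],
                 allocation=alloc)
in_transit = DataSet(project="demo-project",
                     object_keys=["staging/run-42.dat"])
```

In this sketch `alloc.total_tb` sums the per-Pod amounts, and `in_transit.allocation` is `None` because that Data Set is not housed within an Allocation. A real deployment would presumably track far more (access policy, Data Manager contact, and so on); those details are outside what the definitions specify.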