Data Storage

Research Data Store Program

Researchers can store their research data on Intersect's Research Data Store node. The Intersect node will provide members with direct access to up to 50 PBytes of research data storage capacity by 2015. 

The Intersect RDSI node is now accepting nominations of research data collections for storage on the node. To be eligible, data must be assessed as "valuable for future research" by Intersect's Storage Allocation Committee. We encourage all researchers to nominate research data that is valuable for their future research; it need not currently be in their possession. The application form can be found here.

Please note that RDSI is open to non-university researchers and facilities. Enquiries can be made to enquiries@intersect.org.au.

Our staff are working with researchers to understand and implement storage, access methods and authentication. Data held with Intersect will be readily and quickly accessible on both the NeCTAR research cloud and Intersect's HPC facilities. 

What follows is a summary. A more detailed status update can be read here.

Background

Intended to transform the way in which research data collections are stored and accessed, the Research Data Storage Infrastructure (RDSI) project aims to provide large, safe and cost-effective data storage of Australian research data. The RDSI project announced the first of its nodes in June 2012, the Intersect Research Data Store node among them.

RDSI is an Education Investment Fund (EIF) Super Science project with a budget of $50m over three years with which to establish a number of data centre nodes for the storage of research data sets.

"The RDSI project aims to allow researchers to use and manipulate significant collections of data that were previously either unavailable or difficult to access. It will do this by investing in a small number of highly scalable nodes, such as Intersect, to enable them to store collections reliably and provide a consistent interface to the data. It is expected that there will be approximately 100PB of storage supported by RDSI."

Dr Nick Tate
Director, Research Data Storage Infrastructure (RDSI) Project

The Research Data Storage Infrastructure (RDSI) Project, an initiative of the Department of Industry, Innovation, Science, Research and Tertiary Education, is funded from the Education Investment Fund under the Super Science (Future Industries) initiative. 

The Research Data Stores (ReDS) program is the part of the RDSI project that funds storage for collections of value to future research. Collections obtain storage through a merit-based process.

Intersect has received $1.5m from RDSI to establish the node and will be eligible to access further funding and storage under the RDSI Research Data Stores (ReDS) program. The NSW Government has provided an additional $1m towards this research infrastructure via the Science Leveraging Fund.

What are the benefits?

There are many potential benefits to using the research data store:

  • The data stored will be safe. Data will be stored in a certified tier 3 facility, subject to regular backup and replication.
  • The data stored will be secure. Data will be stored behind security and access protocols designed in conjunction with dataset custodians.
  • The data stored will be accessible. Data will be stored in a facility with very high-speed access to AARNet, the NeCTAR research cloud, and high-performance computing facilities.
  • The data stored will be shareable. Data can be made accessible, at researchers' discretion, via file sharing, databases or applications.
  • The data store will be robust. The RDSI project includes provisioning for hardware refresh, ensuring the long-term sustainability of the data.
  • The data stored will incur no cost to researchers. The Intersect RDSI node is funded by the Federal Government and the State Government, and sustained by participating universities.

 

The process going forward

Intersect is working with the universities to set up a process for offering data storage to researchers.

The default process is as follows: a researcher requests storage and the university triages the request. If the university deems the collection suitable for ReDS (judged against the criteria provided by RDSI), it forwards the request to Intersect for submission to the Storage Allocation Committee (SAC). An update on the ReDS program can be viewed here.

If the university deems that the data collection does not meet the ReDS criteria, it will store the data on its pro rata allocation on the Intersect node. Whether or not a collection meets the ReDS requirements, researchers will receive RDSI storage for their data collection.
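For readers who find a decision flow easier to follow as code, the default process above can be sketched as below. This is only an illustration: the names `StorageRequest`, `Outcome` and `triage` are hypothetical and are not part of any Intersect or RDSI system, and the sketch stops at the university's triage step because the SAC's own assessment is not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    # Both outcomes result in storage on the Intersect node.
    FORWARD_TO_SAC = "forwarded to Intersect for submission to the SAC"
    UNIVERSITY_ALLOCATION = "stored on the university's pro rata allocation"


@dataclass
class StorageRequest:
    # Hypothetical fields, used for illustration only.
    collection: str
    meets_reds_criteria: bool  # the university's triage judgement


def triage(request: StorageRequest) -> Outcome:
    """Route a request the way the default process describes."""
    if request.meets_reds_criteria:
        return Outcome.FORWARD_TO_SAC
    return Outcome.UNIVERSITY_ALLOCATION


if __name__ == "__main__":
    example = StorageRequest("Example survey data", meets_reds_criteria=False)
    print(triage(example).value)
```

Either branch ends with the collection held on the Intersect node; the triage only determines which allocation the storage is drawn from.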

What are the criteria?

The RDSI has cited criteria such as the following to establish significance:

  • how hard is the research data to replace?
  • is the research data of cultural significance?
  • was the collection and curation of the data funded through a nationally competitive grant?
  • how much research does the data support? How many researchers cite the data?
  • does the data represent a significant proportion of the research carried out in the relevant discipline?
  • how many researchers make use of (or would like to make use of) the data?
  • how open is the access to the data?

What are the other selection criteria?

The complexity of data collections represents how hard it is for the data to be managed. Factors of complexity include:

  • the size of the data
  • whether the RDSI hosts the primary copy of the data
  • the projected growth of the data 
  • the extent to which the data must be replicated to nodes across the RDSI
  • how researchers will interact with the data (e.g. simple file-based access vs. complex access through proprietary software)

What do I do next?

The Intersect node is now open for researchers to nominate research data collections for storage on the Intersect facility, and have that storage funded by the RDSI. A current focus of Intersect activities is the identification of research data collections of value to future research.

We would like to discuss the suitability of your research data collections for RDSI. Initially, this will involve a brief meeting to understand the characteristics of your research data. You can take the first step by contacting the eResearch support person at your university or by contacting Intersect directly at joe.thurbon@intersect.org.au.

We would also especially like to encourage researchers who know of data that they would like to use, but to which they don't have access, to make contact with their university eResearch Analyst. This specifically includes data owned by state and federal government departments.