Integration of metadata from various systems internal to an institution

Metadata Store Solutions

CSIRO Research Data Service Multi-Source (incl. Sensor Network) Data Capture via Generic Configurable Automated Deposit


1. Project Description

CSIRO’s newly established Research Data Service (RDS) currently has technology that supports a small subset of domain-specific data (i.e. water and pulsar data) as well as generic self-serve deposit of any type of data from across the organisation. Recent requests from various business areas within CSIRO have highlighted the need to complement this capability with new functionality that enables the ongoing automatic deposit of data from various sources. To respond to this demand efficiently and at scale, the RDS wishes to provide this automation through an enterprise-focussed method that makes the addition of future ongoing data deposits a system administration and configuration activity rather than a software development activity.
The initial approach will be to produce functionality that supports automated deposit from specified sources such as defined locations on file systems, database management systems or defined drop-box locations. The system will also target specific metadata standards. The initial business area we will partner with to validate the effectiveness of this generic capability is CSIRO’s Sensors and Sensor Networks (SSN) Transformational Capability Platform (TCP).
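As an illustration only, the idea of adding a deposit source through configuration rather than new software could be sketched as follows. This is a minimal sketch and not the RDS design: the field names (`name`, `kind`, `location`, `pattern`, `metadata_standard`), the function names and the record layout are all hypothetical, and only the file-system source type is handled.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DepositSource:
    """One configured source. All field names are hypothetical."""
    name: str               # label for the source, e.g. a sensor network
    kind: str               # only "filesystem" is handled in this sketch
    location: str           # directory that is scanned for new data
    pattern: str            # glob pattern selecting the files to deposit
    metadata_standard: str  # placeholder label for the targeted standard


def load_sources(config: list[dict]) -> list[DepositSource]:
    """Turn plain configuration entries into typed source definitions."""
    return [DepositSource(**entry) for entry in config]


def find_new_files(source: DepositSource, deposited: set[str]) -> list[Path]:
    """Return matching files that have not been deposited before."""
    if source.kind != "filesystem":
        # Databases and drop-boxes would need their own handlers.
        raise NotImplementedError(f"no handler for source kind {source.kind!r}")
    candidates = sorted(Path(source.location).glob(source.pattern))
    return [p for p in candidates if p.is_file() and str(p) not in deposited]


def build_deposit_record(source: DepositSource, path: Path) -> dict:
    """Assemble a minimal record that a deposit step could submit."""
    return {
        "source": source.name,
        "file": path.name,
        "size_bytes": path.stat().st_size,
        "metadata_standard": source.metadata_standard,
    }


def run_once(config: list[dict], deposited: set[str]) -> list[dict]:
    """Scan every configured source once and record what was picked up."""
    records = []
    for source in load_sources(config):
        for path in find_new_files(source, deposited):
            records.append(build_deposit_record(source, path))
            deposited.add(str(path))  # remember it so it is not deposited twice
    return records
```

Under these assumptions, supporting a new file-system source means appending one entry to the configuration list, while a new source type (a database or a drop-box) would need one additional handler.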
CSIRO operates several sensors and sensor networks that monitor and record conditions about the environment in which they are situated. These sensors and sensor networks produce valuable data that can be used in a variety of contexts. For example, one of the sensor networks that CSIRO operates is situated in Springbrook, QLD, and provides microclimate monitoring that is used in analysis for rainforest regeneration.
At present, the majority of CSIRO’s sensors and sensor networks operate as stand-alone systems but there is a strategic intent to provide a more consistent and generic framework for operating and managing this valuable infrastructure and the data that is produced.
The SSN TCP supports many domains in CSIRO. It aims to tackle sensor networking for many flagships, an aim underpinned by almost all flagship directors sitting on the SSN TCP board. This effort has the potential to benefit multiple domain sciences and multiple flagships at once. Some of the business imperatives for this project include:
Scalable, reusable and generic technologies that will support automated deposit of data and metadata from specific types of sources across the enterprise
Improved productivity of scientists who currently use sensor network data
Improved reliability and timeliness of access in critical projects, e.g. real-time continuous modelling that requires timely and reliable access to sensor network data
A well-defined framework for the management of current and future sensor network data

2. Aims and Objectives
The primary business imperative is the continued support of the newly established Research Data Service through efficient, scalable, enterprise-focussed technologies.
The intent statement for the project is:
“To further establish generic enterprise data management functionality in the Research Data Service (RDS) system to support scalable and reusable capability for all of CSIRO”
The project intends to address the broad issue raised in Section 1 by producing and delivering software technologies that will extend CSIRO’s newly established RDS system to automate the capture of data and metadata from specific ‘types of locations’ and make them available for discovery and re-use.
Benefits to the Researcher/Community/Institution are described through the target outcomes listed in section 3.2.
Making generic configurable automated deposit of data available via the CSIRO RDS system will establish feeds that publish metadata to the ANDS Research Data Australia system. This also puts in place the capability to publish to other external community portals in the future. The combination of these software mechanisms for management of, and access to, a wider array of CSIRO data is aimed at contributing to the ANDS target benefits, namely:
research data will be routinely published, enhancing the reputation of Australian researchers;
collaborative research data initiatives of national significance will be facilitated;
it will be easier for international researchers to work with Australian researchers because of the excellence of the Australian research data environment; and
new research will be carried out using existing data more effectively and often, exploiting more completely the value of Australia’s research data.
High-Level Software Functionality: