At a glance

2020
4 to 6 weeks
In progress
CKAN
State government
Discovery & strategy, Build & migration, Technical advisory
GovTech, Whole of government, Open data
Multidisciplinary teams, Tools & systems, Open standards & common platforms, Open source, Digital adoption

DES’s challenge

Datasets within S&T have traditionally been stored in silos: in managed systems, bespoke storage solutions or general-purpose storage. As a result, some of these datasets are not easily discoverable by a wider audience of scientists.

By adopting a whole-of-department approach to data management, S&T’s vision is for datasets to be discoverable and reusable, both internally and externally, so that these data assets provide more scientific value. A data catalogue will provide a consolidation and discovery platform that enables more effective searching, better data governance and quality, and greater data awareness and sharing within the department.

The outcomes

  • A single consolidated catalogue of all S&T science datasets

  • A powerful search for dataset discovery

  • Enabling sharing and reuse of data assets

DES’s challenge — provide more scientific value from data assets

Queensland’s DES has an important role in the management of complex environmental matters such as protecting the Great Barrier Reef and safeguarding threatened species. Traditionally, however, S&T science data has been siloed and not easily discoverable by staff who are not directly involved in managing it. This has led to a number of challenges within S&T, such as:

  • Little to no visibility of other science teams’ data assets — lost opportunity to share and reuse

  • Difficulty finding and accessing relevant datasets — inefficient discovery

  • Datasets in different formats — inefficient interoperability for reuse

  • Very large datasets that are impractical to distribute – inhibits sharing

  • Inconsistent or siloed metadata — inhibits dataset discovery

S&T recognised that these shortcomings were costing it opportunities to get more scientific value from its data assets. Salsa Digital is helping to address them by co-designing and co-building a data catalogue with S&T.

DES’s transformation — a more connected and collaborative data management portal

Salsa and S&T are implementing a data catalogue using the CKAN open source data portal framework to make data more discoverable and accessible within the department. In particular, the CKAN data catalogue will:

  • Facilitate better data discoverability, management, sharing and reuse of scientific data including spatial, tabular, time-series and imagery

  • Apply metadata standards, controlled vocabularies and governance for consistent description of dataset assets including quality
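To make the discoverability goal concrete, the following is a minimal sketch of how a catalogue search request could be built against CKAN's Action API (`package_search`). The host name, the query text and the bounding box are hypothetical, and the `ext_bbox` spatial filter assumes the ckanext-spatial extension is enabled; the case study does not state which extensions the S&T catalogue uses.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, bbox=None, rows=10):
    """Build a CKAN Action API package_search URL.

    bbox is an optional (min_lon, min_lat, max_lon, max_lat) tuple. It is sent
    as ext_bbox, which is only honoured when ckanext-spatial is installed
    (an assumption here, not something the case study confirms).
    """
    params = {"q": query, "rows": rows}
    if bbox is not None:
        params["ext_bbox"] = ",".join(str(value) for value in bbox)
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Hypothetical catalogue host and a rough bounding box, for illustration only.
url = build_search_url(
    "https://catalogue.example.org",
    "water quality",
    bbox=(138.0, -29.0, 154.0, -10.0),
)
print(url)
```

The function only builds the URL, so it can be tried without a running catalogue; sending the request and reading the JSON response is left to whichever HTTP client is preferred.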

The outcomes — project thus far

Salsa and S&T have structured the project into distinct stages, delivered using an Agile methodology. The requirements for the S&T data catalogue have been prioritised as must-have, should-have and could-have. The must-have requirements were agreed during a discovery phase. These include:

  • Ability to record and view metadata describing datasets including access and data quality

  • Application of standard metadata schemas within the catalogue

  • Application of — and validation against — standard controlled metadata vocabularies

  • Validation of metadata against mandatory and optional fields

  • Functionality to search the catalogue including spatial search

  • Ability for users to view data lineage

  • Ability to assign and maintain metadata against different versions of datasets

  • Functionality to link datasets to series or collections

  • Ability to audit, review and curate the catalogue

  • Notifications to users to review and update metadata records

  • Harvest of metadata from existing data catalogues and data sources

  • Control permissions and security access to the catalogue based on DES user roles
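As an illustration of the two validation requirements above (mandatory and optional fields, and controlled vocabularies), here is a minimal sketch in plain Python. The field names and vocabulary values are hypothetical and are not taken from the S&T schema. In a CKAN deployment this kind of rule is commonly expressed through a schema extension such as ckanext-scheming, but the case study does not say which approach is used.

```python
# Hypothetical schema: these field names and terms are illustrative only.
MANDATORY_FIELDS = {"title", "description", "access_level", "data_quality"}
OPTIONAL_FIELDS = {"lineage", "series", "version"}
CONTROLLED_VOCABULARIES = {
    "access_level": {"open", "internal", "restricted"},
}


def validate_record(record):
    """Return a list of validation errors; an empty list means the record is valid."""
    errors = []

    # Every mandatory field must be present and non-empty.
    for field in sorted(MANDATORY_FIELDS):
        if not record.get(field):
            errors.append(f"missing mandatory field: {field}")

    # Fields outside the schema are reported rather than silently accepted.
    for field in sorted(set(record) - MANDATORY_FIELDS - OPTIONAL_FIELDS):
        errors.append(f"unknown field: {field}")

    # Values for vocabulary-controlled fields must come from the agreed terms.
    for field, allowed in CONTROLLED_VOCABULARIES.items():
        value = record.get(field)
        if value and value not in allowed:
            errors.append(f"{field}: '{value}' is not in the controlled vocabulary")

    return errors


record = {
    "title": "Reef water quality samples",
    "description": "Monthly water quality readings.",
    "access_level": "secret",
}
for error in validate_record(record):
    print(error)
```

Returning a list of errors, rather than raising on the first problem, lets a metadata editor see everything that needs fixing in a record in one pass.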

Salsa is delivering on these requirements through a series of Agile sprints that build out a version of the S&T data catalogue with baseline functionality, ready for launch to science users.

A separate second stage of the project aims to populate the data catalogue with key datasets. More than 500 science datasets have been identified for harvest and registration within the data catalogue. The registration of these datasets will coincide with the broad adoption of the catalogue and the rollout of training to catalogue users.

About QLD DES

The Queensland Department of Environment and Science (DES) is a diverse organisation with responsibilities in the areas of environment, parks and forests, science and technology, and the arts. Some of DES’s key responsibilities include:

  • Protecting and managing Queensland’s parks, forests and the Great Barrier Reef

  • Enhancing Queensland’s ecosystems

  • Preventing, minimising or mitigating impacts on the environment

  • Leading the development of environmental science strategy for government

  • Delivering scientific expertise to protect and manage our environment

  • Supporting the development of Queensland’s science sector