
At a glance
$501K-$1M
2020 - 2021
12 to 18 months
Completed
CKAN, LAGOON
State government
Discovery & strategy, Build & migration, Content & training, Hosting & maintenance, Technical advisory, Support & optimisation
Whole of government, Open data
Overview
The Queensland Department of the Environment, Tourism, Science and Innovation’s challenge
The Queensland Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) was designed to become the ‘source of truth’ for the metadata describing Queensland’s science and environmental data. However, QESD needed a way to keep that metadata in sync with other data portals, such as QSpatial and Queensland’s Open Data Portal, whenever it was updated.
The Queensland Department of the Environment, Tourism, Science and Innovation’s transformation
Salsa co-designed and co-developed custom publishing and export functionality for the QESD CKAN catalogue. This allows QESD catalogue users to sync metadata changes with other data portals.
The outcomes
- The QESD catalogue is now the source of truth for metadata about Queensland’s science and environmental data
- Custom publishing functionality in the CKAN catalogue means users can create and update metadata on the Queensland Open Data Portal
- Custom functionality exports spatial metadata in ISO 19139 XML format so users can upload metadata changes to QSpatial (or other systems that accept this format)
Full version
DETSI’s challenge — how to sync metadata changes across other catalogues
There is increased demand for data from the Department of the Environment, Tourism, Science and Innovation (DETSI) to analyse Queensland’s environmental challenges and to make informed decisions. However, scientists and researchers within DETSI were finding that some legacy systems and services were limiting their ability to effectively meet the increasing demands on them and their data. Data was generally stored in silos and not easily discoverable or reusable.
The Queensland Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) was designed as the whole-of-department approach to metadata management. The QESD catalogue is the metadata component of a consolidated platform intended to enable better data discovery, support more effective data governance and quality controls, and promote data sharing within the department.
The QESD catalogue allows science and environmental data to break free from traditional storage limitations, such as being siloed within teams and research areas, to be discoverable by all scientists and researchers in the department; there are more than 350 staff within DETSI’s Science Division.
The QESD catalogue was designed to be the “source of truth” for the metadata describing Queensland’s science and environmental data. But environmental and science data also existed on other data portals, because scientists publish more formalised data (and metadata) products to various other locations.
The challenge that DETSI faced was how to manage updates to the metadata within its catalogue and to sync changes with other data portals that also hosted science and environmental metadata.
DETSI’s transformation — publishing and exporting metadata
When the Queensland Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) went live in July 2021, it became the metadata “source of truth” for more than 600 environmental and science datasets, a number that will increase over time. These datasets were originally harvested from the Queensland Open Data Portal and the Queensland Spatial catalogue.
To allow QESD users to sync metadata changes with other data portals, Salsa co-designed and co-developed custom publishing and export functionality for the QESD CKAN catalogue.
QESD catalogue users have the ability to publish metadata directly to the Queensland Open Data Portal, and update any changes through the publishing feature, keeping the metadata in sync. For spatial datasets, users export metadata as ISO 19139 compliant XML, which is then uploaded to QSpatial to update and sync metadata.
The outcomes — syncing metadata via push (and not pull)
To sync with other CKAN catalogues, a few options are available. The most common and least complicated option is to use CKAN’s harvest functionality. As the QESD catalogue is the “master”, data.qld would harvest from the QESD Catalogue (using an appropriate update frequency).
While this might be best practice in general for CKAN catalogues, there are limitations of this approach, including:
- Harvesting generally applies only between CKAN catalogues, but the QESD catalogue would ultimately also need to interface with non-CKAN catalogues, such as QSpatial, TERN and others.
- The metadata would be out of sync until the secondary CKAN catalogue harvested any new or changed metadata. If this were seen as an important limitation, the harvest frequency could be increased.
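The second limitation can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a harvester that runs on a fixed schedule; the function name, dates and intervals are hypothetical and are not taken from the QESD or data.qld configuration.

```python
from datetime import datetime, timedelta


def next_harvest_at(changed_at: datetime, first_run: datetime, interval: timedelta) -> datetime:
    """Return the first scheduled harvest at or after `changed_at`."""
    if changed_at <= first_run:
        return first_run
    # Ceiling division: whole intervals needed to reach or pass `changed_at`.
    runs = -((first_run - changed_at) // interval)
    return first_run + runs * interval


# Hypothetical schedule: harvests start at midnight, and an edit is made at 09:00.
first_run = datetime(2021, 7, 1, 0, 0)
edit = datetime(2021, 7, 1, 9, 0)

# How long the secondary catalogue stays stale under each harvest frequency.
daily_lag = next_harvest_at(edit, first_run, timedelta(hours=24)) - edit
hourly_lag = next_harvest_at(edit, first_run, timedelta(hours=1)) - edit
```

With a pull model the lag can be reduced by harvesting more often, but the timing is set by the schedule rather than by the person who edited the metadata.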
The problem effectively came down to: do we push the metadata from the QESD catalogue, the “source of truth”, or do we pull the metadata? This was a difficult decision but ultimately, it was decided to “push” metadata changes from the QESD catalogue to other systems. Publishing from CKAN would require complicated custom functionality, but it would offer catalogue users a consistent interface and process for syncing metadata with all external catalogues. And it would allow the metadata owner/editor to control when syncing occurred.
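The push approach can be sketched as one user-triggered operation that sends a record to every configured external catalogue and reports the outcome straight away. This is an illustrative outline only: `SyncResult`, `push_to_targets` and the target names are hypothetical and do not describe the QESD catalogue's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]
# A push target is any callable that sends one record and raises on failure.
PushTarget = Callable[[Record], None]


@dataclass
class SyncResult:
    target: str
    ok: bool
    error: str = ""


def push_to_targets(record: Record, targets: Dict[str, PushTarget]) -> List[SyncResult]:
    """Push one record to each target and collect the outcome per target."""
    results = []
    for name, push in targets.items():
        try:
            push(record)
            results.append(SyncResult(name, True))
        except Exception as exc:  # report the failure instead of hiding it
            results.append(SyncResult(name, False, str(exc)))
    return results


# Hypothetical targets: one accepts the record, one rejects it.
received: List[Record] = []


def failing_target(record: Record) -> None:
    raise RuntimeError("target unavailable")


results = push_to_targets(
    {"title": "Example dataset"},
    {"open-data-portal": received.append, "other-catalogue": failing_target},
)
```

Because the editor triggers the push and sees one result per target, a failed update is visible at once, which is the control over timing and feedback that the decision above was aiming for.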
The catalogue’s publish functionality allows users to validate and publish dataset distribution (resource) metadata to external catalogues. Users first validate the metadata against the Queensland Open Data Portal’s schema (ensuring all mandatory fields are provided) before publishing the metadata. During the publishing process, the catalogue performs the required data mapping from the QESD catalogue metadata schema to the data.qld metadata schema. This functionality allows catalogue users to publish metadata directly to the Queensland Open Data Portal, and update any changes through the publishing feature, thereby keeping the metadata in sync.
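The validate-then-publish flow can be sketched roughly as follows. Every field name here is an assumption made for illustration: the text does not list the QESD or data.qld schema fields, and the sketch maps before validating purely for brevity. The real catalogue may order and implement these steps differently.

```python
from typing import Callable, Dict, List

# Assumed target-schema mandatory fields (illustrative, not the real data.qld schema).
REQUIRED_TARGET_FIELDS = ["name", "format", "url"]

# Assumed mapping from source field names to target field names.
FIELD_MAP = {
    "title": "name",
    "file_format": "format",
    "access_url": "url",
    "summary": "description",
}


def map_resource(source: Dict[str, str]) -> Dict[str, str]:
    """Rename known source fields to their target names; drop unmapped fields."""
    return {FIELD_MAP[key]: value for key, value in source.items() if key in FIELD_MAP}


def missing_mandatory(target: Dict[str, str]) -> List[str]:
    """List mandatory target fields that are absent or blank."""
    return [f for f in REQUIRED_TARGET_FIELDS if not str(target.get(f, "")).strip()]


def validate_then_publish(source: Dict[str, str], publish: Callable[[Dict[str, str]], None]) -> List[str]:
    """Publish only if validation passes; return the missing fields otherwise."""
    mapped = map_resource(source)
    missing = missing_mandatory(mapped)
    if not missing:
        publish(mapped)
    return missing


# Hypothetical usage: the first resource lacks an access URL, so it is not published.
published: List[Dict[str, str]] = []
incomplete = validate_then_publish({"title": "Water quality", "file_format": "CSV"}, published.append)
complete = validate_then_publish(
    {"title": "Water quality", "file_format": "CSV", "access_url": "https://example.org/data.csv"},
    published.append,
)
```

Returning the list of missing fields, rather than only a pass or fail flag, lets the user see what must be completed before trying again.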
For spatial datasets, limitations with the QSpatial platform at the time of development prevented QESD catalogue users from publishing directly to, and synchronising with, QSpatial. One option for updating and syncing metadata was for the catalogue to export spatial metadata records to a specified location and for QSpatial to import the records using the Esri Geoportal Server’s harvest capabilities. However, because QSpatial might upgrade to Esri Geoportal Server version 2, this option was seen as a stop-gap measure; the ideal goal would be to allow direct publishing via APIs. In addition, the asynchronous harvest process would make it cumbersome for administrators to know whether any updates had failed, meaning that the two catalogues could drift out of sync without active monitoring.
Therefore, a more universal solution was provided: users export metadata as ISO 19139-compliant XML, then upload the XML to QSpatial (or other systems) to create and maintain the metadata there. This solution relies on business processes to ensure that users maintain datasets on both systems, but errors are immediately visible to users, who can rectify or escalate the issue. If and when QSpatial’s underlying technology provides API access, QESD can be upgraded to allow publishing directly to QSpatial. Even then, the functionality to export ISO 19139-compliant XML will remain relevant to users.
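To give a sense of what such an export involves, here is a minimal sketch that builds a small ISO 19139-style record with Python's standard library. It is deliberately partial: a schema-valid record needs further mandatory elements (contact, date stamp and others), and the function name and sample values are hypothetical rather than taken from the QESD catalogue.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata records.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _text_element(parent: ET.Element, tag: str, value: str) -> ET.Element:
    """Add <gmd:tag><gco:CharacterString>value</gco:CharacterString></gmd:tag>."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value
    return wrapper


def export_record(identifier: str, title: str, abstract: str) -> str:
    """Serialise a few core fields as a partial gmd:MD_Metadata document."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    _text_element(root, "fileIdentifier", identifier)
    info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    _text_element(ci_citation, "title", title)
    _text_element(data_id, "abstract", abstract)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values.
xml_text = export_record("example-0001", "Example spatial dataset", "Illustrative abstract.")
```

Because the output is a plain XML document, it can be uploaded to any system that accepts this format, which is what makes the export route more broadly applicable than a connector built for a single catalogue.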
About the Queensland Department of the Environment, Tourism, Science and Innovation (DETSI)
The Queensland Department of the Environment, Tourism, Science and Innovation is a diverse organisation with responsibilities in areas of environment, parks and forests, science and technology, as well as the arts. Some of DETSI’s key responsibilities include:
- Protecting and managing Queensland’s parks, forests and the Great Barrier Reef
- Enhancing Queensland’s ecosystems
- Preventing, minimising or mitigating impacts on the environment
- Leading the development of environmental science strategy for government
- Delivering scientific expertise to protect and manage our environment
- Supporting the development of Queensland’s science sector