
QLD Government — CKAN services to sync metadata across data catalogues

Delivering a consolidated, whole-of-government data resource for citizens and scientists

Case study summary: Salsa co-designed and co-developed custom publishing and export functionality for the Queensland Department of the Environment, Tourism, Science and Innovation Data CKAN catalogue. This lets users sync metadata changes with other data portals.

On this page:

  • Overview
  • Full version
  • About the Queensland Department of the Environment, Tourism, Science and Innovation (DETSI)

At a glance

$501K-$1M
2020 - 2021
12 to 18 months
Completed
CKAN, LAGOON
State government
Discovery & strategy, Build & migration, Content & training, Hosting & maintenance, Technical advisory, Support & optimisation
Whole of government, Open data

Overview

The Queensland Department of the Environment, Tourism, Science and Innovation’s challenge

The QLD Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) was designed to become the ‘source of truth’ for the metadata describing Queensland’s science and environmental data. However, QESD needed a way to keep metadata updates in sync with other data portals, such as QSpatial and Queensland’s Open Data Portal.

The Queensland Department of the Environment, Tourism, Science and Innovation’s transformation

Salsa co-designed and co-developed custom publishing and export functionality for the QESD CKAN catalogue. This allows QESD catalogue users to sync metadata changes with other data portals.

The outcomes

  • QESD catalogue is now the source of truth for Queensland’s science and environmental data’s metadata

  • Custom publishing functionality in the CKAN catalogue means users can create and update metadata on the Queensland Open Data Portal

  • Custom functionality exports spatial metadata in ISO 19139 XML format so users can upload metadata changes to QSpatial (or other systems that accept this format)

Full version

DETSI’s challenge — how to sync metadata changes across other catalogues

There is increased demand for data from the Department of the Environment, Tourism, Science and Innovation (DETSI) to analyse Queensland’s environmental challenges and to make informed decisions. However, scientists and researchers within DETSI were finding that some legacy systems and services were limiting their ability to effectively meet the increasing demands on them and their data. Data was generally stored in silos and not easily discoverable or reusable.

The Queensland Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) was designed as the whole-of-department approach to metadata management. The QESD catalogue is the metadata component of a consolidated platform that enables better data discovery, supports more effective data governance and quality controls, and promotes data sharing within the department.

The QESD catalogue allows science and environmental data to break free from traditional storage limitations, such as being siloed within teams and research areas, to be discoverable by all scientists and researchers in the department; there are more than 350 staff within DETSI’s Science Division.

The QESD catalogue was designed to be the “source of truth” for Queensland science and environmental data’s metadata. But environmental and science data also existed on other data portals. Scientists publish more formalised data (and metadata) products to various locations, such as:

  • Queensland Open Data Portal
  • Queensland Publication Portal
  • TERN
  • QSpatial

The challenge that DETSI faced was how to manage updates to the metadata within its catalogue and to sync changes with other data portals that also hosted science and environmental metadata.

DETSI’s transformation — publishing and exporting metadata

When the Queensland Department of the Environment, Tourism, Science and Innovation data catalogue (QESD) went live in July 2021, it became the metadata “source of truth” for more than 600 environmental and science datasets (the number of datasets in the catalogue will increase over time). These datasets were originally harvested from the Queensland Open Data portal and the Queensland Spatial catalogue.

To allow QESD catalogue users to sync metadata changes with other data portals, Salsa co-designed and co-developed custom publishing and export functionality for the QESD CKAN catalogue.

QESD catalogue users have the ability to publish metadata directly to the Queensland Open Data Portal, and update any changes through the publishing feature, keeping the metadata in sync. For spatial datasets, users export metadata as ISO 19139 compliant XML, which is then uploaded to QSpatial to update and sync metadata.

The outcomes — syncing metadata via push (and not pull)

To sync with other CKAN catalogues, a few options are available. The most common and least complicated option is to use CKAN’s harvest functionality. As the QESD catalogue is the “master”, data.qld would harvest from the QESD Catalogue (using an appropriate update frequency).

While this might be best practice in general for CKAN catalogues, there are limitations of this approach, including:

  • The harvest would generally apply to CKAN to CKAN catalogues, but the QESD catalogue would ultimately need to interface with non-CKAN catalogues also, such as QSpatial, TERN and others.

  • The metadata would be out-of-sync until the secondary CKAN catalogue harvested any new or changed metadata. If this was seen as a relatively important limitation, then the harvest frequency could be increased.

The problem effectively came down to: do we push the metadata from the QESD catalogue, the “source of truth”, or do we pull the metadata? This was a difficult decision but ultimately, it was decided to “push” metadata changes from the QESD catalogue to other systems. Publishing from CKAN would require complicated custom functionality, but it would offer catalogue users a consistent interface and process for syncing metadata with all external catalogues. And it would allow the metadata owner/editor to control when syncing occurred.
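The difference between the two models can be sketched with a pair of in-memory catalogues. This is only an illustration under stated assumptions: the `Catalogue` class, the `harvest` and `publish` functions and the sample dataset are hypothetical and are not the CKAN harvester or the QESD publishing code.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    """A hypothetical in-memory catalogue keyed by dataset id."""
    name: str
    records: dict = field(default_factory=dict)


def harvest(source: Catalogue, target: Catalogue) -> int:
    """Pull model: the target copies every new or changed record from the source."""
    changed = 0
    for dataset_id, metadata in source.records.items():
        if target.records.get(dataset_id) != metadata:
            target.records[dataset_id] = dict(metadata)
            changed += 1
    return changed


def publish(source: Catalogue, target: Catalogue, dataset_id: str) -> None:
    """Push model: the editor sends a single record to the target on demand."""
    target.records[dataset_id] = dict(source.records[dataset_id])


qesd = Catalogue("QESD")
open_data = Catalogue("Open Data Portal")

# Initial state: one scheduled harvest brings both catalogues into line.
qesd.records["ds-1"] = {"title": "Water quality"}
harvest(qesd, open_data)

# An editor changes the title in the source of truth.
qesd.records["ds-1"]["title"] = "Water quality 2021"

# Under the pull model the target stays stale until the next harvest runs.
stale_before_sync = open_data.records["ds-1"]["title"]

# Under the push model the editor decides when the change goes out.
publish(qesd, open_data, "ds-1")
in_sync_after_push = open_data.records["ds-1"] == qesd.records["ds-1"]
```

In this sketch the staleness window under pull is bounded by the harvest interval, while under push it is bounded by when the editor chooses to publish.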

The catalogue’s publish functionality allows users to validate and publish dataset distribution (resource) metadata to external catalogues. Users first validate the metadata against the Queensland Open Data Portal’s schema (ensuring all mandatory fields are provided) before publishing the metadata. During the publishing process, the catalogue performs the required data mapping from the QESD catalogue metadata schema to the data.qld metadata schema. This functionality allows catalogue users to publish metadata directly to the Queensland Open Data Portal, and update any changes through the publishing feature, thereby keeping the metadata in sync.
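A validate-then-publish flow of this kind might look roughly like the sketch below. Everything named here is an assumption made for illustration: the field names, the `FIELD_MAP` mapping and the `send` callable are hypothetical and do not reflect the real QESD or data.qld schemas or APIs. For brevity the sketch maps the record first and validates the mapped result, which may differ from the order used in the catalogue.

```python
# Hypothetical field names: neither schema here is the real QESD or data.qld schema.
MANDATORY_TARGET_FIELDS = ("title", "notes", "license_id")
FIELD_MAP = {"title": "title", "abstract": "notes", "licence": "license_id"}


def map_metadata(source_record: dict) -> dict:
    """Rename source fields to their target-schema equivalents, skipping empty values."""
    return {
        target: source_record[source]
        for source, target in FIELD_MAP.items()
        if source_record.get(source)
    }


def validate(target_record: dict) -> list:
    """Return the mandatory target fields that are missing or empty."""
    return [name for name in MANDATORY_TARGET_FIELDS if not target_record.get(name)]


def publish(source_record: dict, send) -> dict:
    """Map, validate, then hand the record to `send` (a stand-in for the remote call)."""
    target_record = map_metadata(source_record)
    missing = validate(target_record)
    if missing:
        # Nothing is sent, so the editor sees the problem before the catalogues diverge.
        return {"published": False, "missing": missing}
    send(target_record)
    return {"published": True, "missing": []}


sent = []
ok = publish(
    {"title": "Reef survey", "abstract": "Annual survey", "licence": "cc-by-4.0"},
    sent.append,
)
rejected = publish({"title": "Reef survey"}, sent.append)
```

Returning the list of missing fields, rather than raising an error, lets an interface show the editor exactly what to fix before trying again.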

For spatial datasets, limitations with the QSpatial platform at the time of development prevented QESD catalogue users from publishing directly to, and synchronising with, QSpatial. One option for updating and syncing metadata was for the catalogue to export spatial metadata records to a specified location and for QSpatial to import the records using the Esri Geoportal Server’s harvest capabilities. However, because of the possibility that QSpatial would upgrade to Esri Geoportal Server version 2, this option was seen as a stop-gap measure. The ideal goal would be to allow direct publishing via APIs. Also, using the asynchronous harvest process would make it cumbersome for administrators to know whether any updates failed, meaning that the two catalogues could be out of sync without active monitoring.

Therefore, a more universal solution was provided, enabling users to export metadata as ISO 19139 compliant XML. Users then upload the XML to QSpatial (or other systems) to create and maintain metadata within QSpatial. This solution relies on business processes to ensure that users maintain datasets on both systems, but errors are immediately known to users who can rectify or escalate the issue. And if / when QSpatial’s underlying technology provides API access, QESD can be upgraded to allow publishing directly to QSpatial. But the functionality to export ISO 19139 compliant XML will still be relevant to users.
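As a rough illustration of what such an export involves, the sketch below builds a small ISO 19139 style fragment with Python's standard library. It uses the standard `gmd` and `gco` namespaces, but the `export_iso19139` function and the record keys are hypothetical. A complete ISO 19139 record requires further mandatory elements (for example contact and date stamp), so this output is not schema-valid and does not represent the QESD export itself.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _text_element(parent, tag, value):
    """Append a gmd element wrapping a gco:CharacterString holding `value`."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value
    return wrapper


def export_iso19139(record: dict) -> str:
    """Serialise a few fields of a record as an ISO 19139 style XML string."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    _text_element(root, "fileIdentifier", record["id"])

    ident_info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_ident = ET.SubElement(ident_info, f"{{{GMD}}}MD_DataIdentification")

    # The title sits inside the citation; the abstract follows the citation.
    citation = ET.SubElement(data_ident, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    _text_element(ci_citation, "title", record["title"])
    _text_element(data_ident, "abstract", record["abstract"])

    return ET.tostring(root, encoding="unicode")


xml_text = export_iso19139(
    {"id": "ds-1", "title": "Reef survey", "abstract": "Annual survey"}
)
```

Because the result is a plain XML string, it can be saved to a file and uploaded by hand, which matches the manual upload step described above.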

Lagoon AU Public Cluster

About the Queensland Department of the Environment, Tourism, Science and Innovation (DETSI)

The Queensland Department of the Environment, Tourism, Science and Innovation is a diverse organisation with responsibilities in areas of environment, parks and forests, science and technology, as well as the arts. Some of DETSI’s key responsibilities include:

  • Protecting and managing Queensland’s parks, forests and the Great Barrier Reef
  • Enhancing Queensland’s ecosystems
  • Preventing, minimising or mitigating the impacts to the environment
  • Leading the development of environmental science strategy for government
  • Delivering scientific expertise to protect and manage our environment
  • Supporting the development of Queensland’s science sector