At a glance

2020 - 2021
2 to 4 months
Completed
Single Digital Presence
State government
Discovery & strategy, Support & optimisation, Technical advisory
Whole of government, Web development, Headless CMS, Content management systems
User needs, Multidisciplinary teams, Tools & systems, Security, Open standards & common platforms, Open source, Testing, Measure performance

Overview

SDP’s challenge

During his daily pandemic briefings, the Victorian Premier directed citizens to government websites for more information. Request volumes on the SDP platform increased by around 4000%, to 400,000 requests per minute, in the space of 30 seconds.

SDP’s transformation

Salsa, amazee.io, section.io and DPC set up scaling and caching strategies to meet the significant increases in traffic. We preemptively scaled up origin and edge resources, and put in place a caching strategy that achieved a 99% cache offload rate, which kept the origin operational.

The outcomes

  • The SDP platform can handle rapid ramp-ups of traffic
  • New incident response strategy created to meet traffic demands
  • Tailored caching solution that empowers the decoupled architecture of SDP
  • Process improvements and new burst capacity for high traffic events

Full case study

Below is more information on the challenge, transformation and outcomes.

SDP’s challenge — massive traffic surges

The COVID-19 pandemic changed the way we consume content. It introduced a transition period where web channels were the primary delivery mechanism for information to citizens. Government officials made announcements, with more detailed information available via website content. This drove huge amounts of traffic to government websites all at once.

This changed the traffic profile of the websites from a standard request volume to volumes that resembled distributed denial-of-service attacks. Request volumes across the SDP platform increased by around 4000%, going from roughly 10,000 requests per minute to 400,000 requests per minute in the space of 30 seconds.
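
To put those figures in perspective, the short calculation below works through the arithmetic. It uses only the two request volumes quoted above; the function name is illustrative. Strictly, a jump from 10,000 to 400,000 is a 40-fold multiplier, which is a 3900% increase, consistent with the "around 4000%" figure.

```python
def surge(baseline_rpm, peak_rpm):
    """Return (multiplier, percent increase) between two request rates."""
    multiplier = peak_rpm / baseline_rpm
    percent_increase = (peak_rpm - baseline_rpm) / baseline_rpm * 100
    return multiplier, percent_increase


# Figures quoted in the case study, in requests per minute.
BASELINE_RPM = 10_000
PEAK_RPM = 400_000

multiplier, percent_increase = surge(BASELINE_RPM, PEAK_RPM)
peak_rps = PEAK_RPM / 60  # the same peak expressed per second

print(f"{multiplier:.0f}x the baseline, a {percent_increase:.0f}% increase")
print(f"about {peak_rps:.0f} requests per second at peak")
```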

Below: Request volume ramp up at edge

Below: Origin traffic request volume

SDP’s transformation — refined architecture for improved performance

Salsa, amazee.io, section.io and SDP worked together to handle these large traffic spikes. We tackled the issue through both pre-incident planning and post-incident planning, refining our processes and continually improving.

Pre-incident planning

We made the following refinements to the technology architecture of the SDP platform:

  • Auto-scalable workloads

    • Origin clusters define horizontal pod autoscalers that are configured to add more computing resources automatically based on CPU utilisation of the web workload.

    • Edge workloads are deployed to Section’s edge Kubernetes service, which allows for automated scaling events to be triggered and more caching nodes to be added as traffic increases.

  • Caching strategy

    • We worked to define an effective caching strategy for the web properties. The strategy involved using cache tags generated by Drupal and surfacing them via decoupled frontends. This allowed Drupal to issue invalidation requests for all connected devices and enabled content change during high traffic events without service interruption.

    • This strategy allowed us to achieve a 99% cache offload rate, which ensured that origin remained operational during traffic peaks.

    • We also set up a regular load testing process.
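
The tag-based invalidation idea can be sketched with a toy in-memory cache. This is a minimal illustration, not the SDP or Section implementation: the `TaggedCache` class, the `origin` function, the URLs and the tag values are all hypothetical, although the `node:1` form follows the general shape of Drupal cache tags. Each cached response remembers its tags, so a content change only evicts the pages that depend on it, and everything else keeps being served from cache.

```python
from collections import defaultdict


class TaggedCache:
    """Toy edge cache keyed by URL, with tag-based invalidation."""

    def __init__(self, origin):
        self.origin = origin                  # callable: url -> (body, tags)
        self.entries = {}                     # url -> cached body
        self.urls_by_tag = defaultdict(set)   # tag -> urls carrying that tag
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.entries:
            self.hits += 1
            return self.entries[url]
        self.misses += 1                      # only misses reach the origin
        body, tags = self.origin(url)
        self.entries[url] = body
        for tag in tags:
            self.urls_by_tag[tag].add(url)
        return body

    def invalidate(self, tag):
        """Evict every cached URL that was stored with this tag."""
        for url in self.urls_by_tag.pop(tag, set()):
            self.entries.pop(url, None)

    def offload_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Hypothetical origin: two pages, each tagged with the content it renders.
content = {"/covid": "v1", "/about": "about us"}
tags = {"/covid": ["node:1"], "/about": ["node:2"]}


def origin(url):
    return content[url], tags[url]


cache = TaggedCache(origin)
for _ in range(100):
    cache.get("/covid")                       # 1 miss, then 99 hits
offload_before_edit = cache.offload_rate()

content["/covid"] = "v2"                      # an editor publishes a change
cache.invalidate("node:1")                    # the CMS purges by tag
after_edit = cache.get("/covid")              # one miss refetches the page
```

In this sketch 100 requests for one page cost the origin a single fetch, which is what a 99% offload rate means in practice, and the edit costs exactly one more fetch instead of flushing the whole cache.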

Post-incident strategy

After each event we held a retrospective and blameless post-mortem to analyse how the team and systems handled the traffic event, and to identify improvements we could make before the next one.

Some of the key activities included:

  • Log monitoring and alerting processes

  • High traffic event response team: due to the nature of how the events were triggered, we could form a team to review and monitor platform services during events and to:

    • Preemptively scale up origin resources

    • Preemptively scale up edge resources
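
One way to see why preemptive scaling helps is to look at how a horizontal pod autoscaler sizes a workload. The sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, which is the desired replica count equals the ceiling of current replicas multiplied by the ratio of the observed metric to the target metric. It is a simplification that ignores the autoscaler's tolerance band and stabilisation windows, and the replica counts and CPU percentages are made-up values, not SDP's configuration.

```python
import math


def desired_replicas(current_replicas, current_cpu, target_cpu, max_replicas):
    """Simplified HPA rule: scale in proportion to observed vs target CPU.

    current_cpu and target_cpu are average utilisation percentages.
    The result is clamped between 1 and max_replicas.
    """
    wanted = math.ceil(current_replicas * current_cpu / target_cpu)
    return min(max(wanted, 1), max_replicas)


# Hypothetical numbers: 4 pods running at 200% of requested CPU, target 50%.
reactive = desired_replicas(4, current_cpu=200, target_cpu=50, max_replicas=50)
capped = desired_replicas(4, current_cpu=200, target_cpu=50, max_replicas=10)
steady = desired_replicas(4, current_cpu=50, target_cpu=50, max_replicas=10)

print(reactive, capped, steady)
```

Because this rule reacts to utilisation that has already been observed, new capacity arrives only after load has risen. When traffic ramps up within 30 seconds, starting the event with extra replicas already running gives the headroom that reactive scaling alone may not provide in time.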

The outcomes — an even more resilient platform

  • A platform that can handle rapid ramp-ups of traffic
  • New incident response strategy: we used these events as inputs to define a response strategy that involves:
    • Preemptively engaging team members where possible
    • Preemptively scaling workloads (both origin and edge) prior to the event to give more compute capacity and headroom
  • Tailored caching solution that empowers the decoupled architecture of SDP
  • Process improvements and the introduction of burst capacity to monitor and be on stand-by during high traffic events

About SDP

Victoria’s Single Digital Presence is the whole-of-government digital platform run by Victoria’s Department of Premier and Cabinet (DPC). DPC is responsible for several elements of Victoria’s digital engagement, including SDP, vic.gov.au, data.vic.gov.au and engage.vic.gov.au.