Last updated: 2 Oct 2024

GOV.UK content mirrors

A GOV.UK mirror is a static copy of pages and assets hosted on www.gov.uk or assets.publishing.service.gov.uk (or equivalent domains in integration and staging). A mirror includes:

  • HTML pages
  • related assets for those pages (e.g. JavaScript, CSS, images, fonts)
  • other linked assets (or “attachments”) such as CSVs and PDFs

Available mirrors

We maintain three mirrors, ranked by priority:

  1. Primary: AWS S3 bucket named govuk-<environment>-mirror in eu-west-2
  2. Secondary: AWS S3 bucket named govuk-<environment>-mirror-replica in eu-west-1 (production only)
  3. Tertiary: Google Cloud Storage (GCS) bucket named govuk-<environment>-mirror

We spread the mirrors across two AWS regions (eu-west-2 and eu-west-1) and GCP to ensure redundancy and increase availability.
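As an illustration of the naming scheme above, the following minimal sketch lists the mirror buckets expected for a given environment. The function name, the MirrorBucket type and the environment string "production" are assumptions made for this example; only the bucket name patterns and regions come from the list above.

```python
from typing import List, NamedTuple, Optional


class MirrorBucket(NamedTuple):
    priority: int           # 1 = primary, 2 = secondary, 3 = tertiary
    provider: str           # "aws-s3" or "gcs" (labels chosen for this sketch)
    region: Optional[str]   # AWS region; None for GCS, as no region is documented
    name: str


def mirror_buckets(environment: str) -> List[MirrorBucket]:
    """Return the mirror buckets for an environment, in priority order.

    Assumes the environment appears in bucket names exactly as passed in,
    e.g. "integration", "staging" or "production".
    """
    buckets = [
        MirrorBucket(1, "aws-s3", "eu-west-2", f"govuk-{environment}-mirror"),
    ]
    # The secondary S3 replica exists in production only.
    if environment == "production":
        buckets.append(
            MirrorBucket(2, "aws-s3", "eu-west-1", f"govuk-{environment}-mirror-replica")
        )
    buckets.append(MirrorBucket(3, "gcs", None, f"govuk-{environment}-mirror"))
    return buckets
```

Calling mirror_buckets("staging") returns only the primary S3 bucket and the GCS bucket, because the replica is not created outside production.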

When is the GOV.UK mirror used?

If Fastly, our primary CDN, cannot fetch a page from our backend servers (because of a timeout or a 5xx error), then Fastly will attempt to serve the page from a mirror, trying each mirror in order of priority.
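The failover order described above can be sketched roughly as follows. This is not the Fastly configuration itself, only a small Python model of the decision logic. The function names and the fetcher callables are hypothetical, and the sketch assumes that a mirror is skipped under the same conditions as the origin (a timeout or a 5xx response), which the text above does not state.

```python
from typing import Callable, List, Optional, Tuple

Response = Tuple[int, bytes]        # (HTTP status code, body)
Fetcher = Callable[[str], Response]  # hypothetical: fetches a path from one backend


def is_server_error(status: int) -> bool:
    """True for any 5xx status code."""
    return 500 <= status <= 599


def fetch_with_fallback(
    path: str,
    origin: Fetcher,
    mirrors: List[Tuple[str, Fetcher]],
) -> Optional[Tuple[str, int, bytes]]:
    """Try the origin first, then each mirror in priority order.

    Returns (backend name, status, body) for the first backend that neither
    times out nor returns a 5xx, or None if every backend fails.
    """
    backends = [("origin", origin)] + list(mirrors)
    for name, fetch in backends:
        try:
            status, body = fetch(path)
        except TimeoutError:
            continue  # treat a timeout as a failure and move to the next backend
        if not is_server_error(status):
            return name, status, body
    return None
```

The mirrors argument is expected to be ordered by priority, for example primary S3, then the S3 replica, then GCS.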

How are the mirrors populated?

Every day the govuk-mirror-sync cronjob crawls the www and assets domains, saves pages and assets to disk and then uploads the files to the primary S3 bucket. The govuk-mirror repository contains the code responsible for crawling and saving pages to disk.

S3 Replication automatically copies any changes from the primary S3 bucket to the secondary S3 bucket. This is configured in govuk-aws.

GCP Storage Transfer Service copies any changes from the primary S3 bucket to the tertiary GCS bucket.

What is not covered by mirrors?

Certain page types aren’t included in the mirrors:

  • Smart answer pages (as the govuk-mirror crawler doesn’t support following form links)
  • CSV preview pages

Troubleshooting

Check the logs of the govuk-mirror-sync job in Argo to see whether there were any errors while crawling, saving pages or uploading to S3.

Check buckets in AWS S3 or GCP to see if they are populated.

You can fetch pages directly from a mirror by specifying the Backend-Override header. The allowed values are mirrorS3, mirrorS3Replica and mirrorGCS. For example:

    curl -H 'Backend-Override: mirrorS3' https://www.gov.uk
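To check all three mirrors in one go, the Backend-Override request can be scripted. The sketch below uses only the Python standard library. The function names and the 10-second default timeout are choices made for this example; the header name and its allowed values are the ones listed above.

```python
import urllib.error
import urllib.request
from typing import Dict, Union

# Allowed Backend-Override values, in mirror priority order.
MIRROR_BACKENDS = ("mirrorS3", "mirrorS3Replica", "mirrorGCS")


def build_mirror_request(url: str, backend: str) -> urllib.request.Request:
    """Build a GET request that asks for the page from one specific mirror."""
    if backend not in MIRROR_BACKENDS:
        raise ValueError(f"unknown mirror backend: {backend!r}")
    return urllib.request.Request(url, headers={"Backend-Override": backend})


def check_mirrors(url: str, timeout: float = 10.0) -> Dict[str, Union[int, str]]:
    """Request the URL from every mirror.

    Maps each backend name to the HTTP status code it returned, or to an
    error message if the request could not be completed.
    """
    results: Dict[str, Union[int, str]] = {}
    for backend in MIRROR_BACKENDS:
        request = build_mirror_request(url, backend)
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                results[backend] = response.status
        except urllib.error.HTTPError as error:
            # The server answered, but with a 4xx or 5xx status.
            results[backend] = error.code
        except (urllib.error.URLError, TimeoutError) as error:
            # DNS failure, refused connection, timeout and similar.
            results[backend] = str(error)
    return results
```

For example, check_mirrors("https://www.gov.uk") makes one request per mirror and returns a dictionary keyed by backend name, which makes it easy to spot a mirror that is empty or unreachable.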