
CDN & Caching

Our content delivery network (CDN)

GOV.UK uses Fastly as a CDN. Citizen users don’t access GOV.UK servers directly; they connect via the CDN. This is better because:

  • The CDN “edge nodes” (webservers) are closer to end users. Fastly has servers all around the world but our “origin” servers are only in the UK.
  • It reduces load on our origin. Fastly uses Varnish to cache responses.

The CDN is responsible for retrying requests against the static mirror.
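The retry behaviour can be pictured with a small sketch. This is not the production VCL: the `fetch_with_fallback` function, the fetcher callables and the rule "retry on any 5XX" are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical fetcher type: takes a request path, returns an HTTP status code.
Fetcher = Callable[[str], int]


def fetch_with_fallback(
    path: str, origin: Fetcher, mirrors: Sequence[Fetcher]
) -> Tuple[str, int]:
    """Try origin first; if it answers with a 5XX, retry each mirror in turn.

    Assumption: a 5XX status is what triggers the retry.
    Returns the name of the backend that answered and its status code.
    """
    status = origin(path)
    if status < 500:
        return "origin", status
    for index, mirror in enumerate(mirrors):
        mirror_status = mirror(path)
        if mirror_status < 500:
            return f"mirror-{index}", mirror_status
    # Every backend failed, so report the original origin error.
    return "origin", status
```

For example, with an origin that always returns 503 and one mirror that returns 200, the call returns `("mirror-0", 200)`.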


CDN configuration

Most of the CDN config is versioned and scripted:

These are deployed to integration, staging and production.

Some configuration isn’t scripted, such as logging. The www, bouncer and assets services send logs to S3 and stream them to monitoring-1. These logging endpoints are configured directly in the Fastly UI. There is documentation on how to query the CDN logs.

Fastly Caching

The main cache is Varnish, which Fastly run for us.

Varnish lets us configure our caching logic with VCL (Varnish config language).

It also lets us do things like only allowing connections to staging from permitted IPs, forcing SSL and blocking IP addresses.

We set a default TTL of 5000s on cached objects. This means that pages such as the GOV.UK homepage are cached for about 83 minutes. 5XX responses are cached for 1s; mirror responses are cached for 15 minutes.

We also set a grace period of 24 hours. So if the homepage server is down, we’ll continue to serve a stale homepage for 24 hours.
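The TTL and grace rules above can be written out as a minimal sketch. The function names are hypothetical, and two details are assumptions rather than facts from the VCL: that the mirror TTL takes precedence over the 5XX TTL, and that stale objects are only served during the grace period when origin is unhealthy.

```python
DEFAULT_TTL_SECONDS = 5000           # default for cached objects
ERROR_TTL_SECONDS = 1                # 5XX responses
MIRROR_TTL_SECONDS = 15 * 60         # responses served from the mirror
GRACE_SECONDS = 24 * 60 * 60         # how long a stale object may be served


def ttl_for(status: int, from_mirror: bool = False) -> int:
    """Pick the TTL in seconds for a response."""
    if from_mirror:
        # Assumption: the mirror rule wins over the 5XX rule.
        return MIRROR_TTL_SECONDS
    if 500 <= status <= 599:
        return ERROR_TTL_SECONDS
    return DEFAULT_TTL_SECONDS


def can_serve(age_seconds: int, ttl_seconds: int, origin_healthy: bool) -> bool:
    """Return True if a cached object of this age may still be served."""
    if age_seconds <= ttl_seconds:
        return True  # still fresh
    # Stale: only acceptable inside the grace window while origin is down.
    return (not origin_healthy) and age_seconds <= ttl_seconds + GRACE_SECONDS


# 5000 seconds is 83 minutes and 20 seconds.
minutes, seconds = divmod(DEFAULT_TTL_SECONDS, 60)
```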

We cache the response to any non-GET/HEAD request that returns a 404 or 405 status for the default TTL. This means (for example) that the response to a POST request that returns a 405 (Method Not Allowed) will be cached.

These are the GET request status codes that Varnish caches automatically: 200, 203, 300, 301, 302, 404 or 410. See the Varnish docs for more detail. We have added to these: see the repo VCL for special handling of certain status codes, and for the most up-to-date version of what we’re running in Fastly. Refer to the Varnish 2.1 documentation when looking at the VCL code.
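As a rough illustration of the rules described so far, the sketch below encodes only the status codes listed on this page. It does not include the extra status handling in the repo VCL, the function name is hypothetical, and treating HEAD the same as GET is an assumption.

```python
# Status codes listed above as cached automatically for GET requests.
CACHEABLE_GET_STATUSES = frozenset({200, 203, 300, 301, 302, 404, 410})

# Status codes cached for any non-GET/HEAD request.
CACHEABLE_OTHER_METHOD_STATUSES = frozenset({404, 405})


def is_cached_by_default(method: str, status: int) -> bool:
    """Return True if a response would be cached under the listed rules."""
    if method.upper() in ("GET", "HEAD"):
        # Assumption: HEAD follows the same list as GET.
        return status in CACHEABLE_GET_STATUSES
    return status in CACHEABLE_OTHER_METHOD_STATUSES
```

Under these rules `is_cached_by_default("POST", 405)` is true, while `is_cached_by_default("POST", 200)` is false.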

Testing VCL

VCL can be tricky to get right. When making changes to the VCL, add smoke tests to smokey and check that they don’t fail in staging.

You can also test manually with Fastly’s Fiddle tool, or with cURL by including a debug header:

curl -svo /dev/null -H "Fastly-Debug:1" <URL to test>

This will give you various debugging headers that may be useful:

< Fastly-Debug-Path: <nodes you hit>
< Fastly-Debug-TTL: <nodes with TTL>
< Fastly-Debug-Digest: <hash>
< X-Served-By: <node that responded>
< X-Cache: HIT, HIT
< X-Cache-Hits: 1
< X-Timer: <time it took>
< Vary: Accept-Encoding, Accept-Encoding

See the Varnish/Fastly docs for what these mean. Check out the Fastly debugging guide for more details on testing.
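If you want to compare these headers across several requests, a small parser can help. The sketch below is one possible approach, with hypothetical function names: it reads the verbose output of `curl -sv` (which curl writes to stderr) and collects the response header lines, which are prefixed with `< `.

```python
from typing import Dict, List


def parse_response_headers(verbose_output: str) -> Dict[str, str]:
    """Collect response headers from `curl -sv` output into a dict.

    Header names are lower-cased. If a header repeats, the last value wins.
    """
    headers: Dict[str, str] = {}
    for line in verbose_output.splitlines():
        if not line.startswith("< "):
            continue  # request lines ("> ") and curl info lines ("* ")
        name, separator, value = line[2:].partition(":")
        if not separator:
            continue  # the status line has no colon, e.g. "< HTTP/2 200"
        headers[name.strip().lower()] = value.strip()
    return headers


def cache_states(headers: Dict[str, str]) -> List[str]:
    """Split the X-Cache value into its per-node entries."""
    value = headers.get("x-cache", "")
    return [entry.strip() for entry in value.split(",") if entry.strip()]
```

Given the sample output above, `cache_states` returns `["HIT", "HIT"]`.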

Fastly’s IP ranges

Fastly publish their cache node IP address ranges as JSON from their API. We use these IP addresses in 2 places:

  • Origin has firewall rules in place so that only our office and Fastly can connect.
  • Our Fastly Varnish config restricts HTTP purges to specific IP addresses (otherwise anyone would be able to purge the cache).
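Both uses come down to checking whether a client address falls inside one of the published ranges. The sketch below shows that check with Python’s standard `ipaddress` module. The JSON shape (`addresses` and `ipv6_addresses` keys) is an assumption about the API response, the sample ranges are reserved documentation ranges rather than real Fastly ranges, and the function names are hypothetical.

```python
import ipaddress
import json
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Made-up sample data using documentation ranges, not real Fastly ranges.
SAMPLE_IP_LIST = json.loads(
    '{"addresses": ["192.0.2.0/24"], "ipv6_addresses": ["2001:db8::/32"]}'
)


def load_networks(ip_list: dict) -> List[Network]:
    """Turn the CIDR strings in the JSON document into network objects."""
    cidrs = ip_list.get("addresses", []) + ip_list.get("ipv6_addresses", [])
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_allowed(client_ip: str, networks: List[Network]) -> bool:
    """Return True if client_ip falls inside any of the given networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```

With the sample data, `is_allowed("192.0.2.10", load_networks(SAMPLE_IP_LIST))` is true and `198.51.100.1` is rejected.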

Banning IP addresses at the CDN edge

We occasionally decide to ban an IP address at our CDN edge if it exhibits any of the following behaviour:

  • not respecting our robots.txt directives
  • repeatedly receiving 429 (rate limit) error responses from origin and not slowing down
  • making suspicious requests like attempting SQL injection queries

Banning IPs shouldn’t be taken lightly. An IP address can be shared by multiple user devices, and the user behind an IP address can change over time, so there’s always a chance that we block a legitimate user when we ban an IP address.

You can change the list of banned IP addresses by modifying the YAML config file and deploying the configuration.

Bouncer’s Fastly service

A Fastly CDN service can normally handle up to 1000 domains (this limit was undocumented).

We have asked them to increase this limit for Bouncer’s service a few times as the number of domains it handled grew, and the limit is currently 3500. We have about 2000 domains so shouldn’t need to increase it again for a while.

If we reach the limit then the Jenkins job to update Bouncer’s CDN config should fail and new domains won’t be added to the service.
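To get a feel for how much headroom is left, the figures in this section can be plugged into a quick calculation. This is only a rough sketch: the current domain count is approximate, the function name is hypothetical, and because each new site adds at least 4 domains the result is an upper bound.

```python
DOMAIN_LIMIT = 3500          # current limit on Bouncer's Fastly service
CURRENT_DOMAINS = 2000       # approximate figure from this page
MIN_DOMAINS_PER_SITE = 4     # each new site adds at least this many domains


def sites_until_limit(limit: int, current: int, per_site: int) -> int:
    """Upper bound on how many more sites fit before the limit is reached."""
    remaining = max(limit - current, 0)
    return remaining // per_site


headroom = sites_until_limit(DOMAIN_LIMIT, CURRENT_DOMAINS, MIN_DOMAINS_PER_SITE)
# (3500 - 2000) // 4 = 375 sites at most
```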

Configuring a new site in Transition generally adds at least 4 domains to the service, including the aka domain for each real domain. For example:


New solution for Bouncer and Fastly

Fastly’s new solution to get around the domain limit is a “service pinned map”.

They have created a map which we access using a dedicated domain. Domains that need to be transitioned can CNAME to this domain. It also has 4 IP addresses assigned, which at the time of writing are the same as the A records at that hostname:


Domains do not need to be added to the “Production Bouncer” Fastly service like they used to be.

This page was last reviewed on 10 July 2019. It needs to be reviewed again on 10 January 2020 by the page owner #govuk-2ndline.