Last updated: 22 Sep 2022

2nd line drills

Several procedures are important to drill while on 2nd line. Drilling makes developers familiar with each process and validates that the documented steps still work.

Drill detaching an instance

Follow the Detaching an instance from an Auto Scaling Group guidance.

Drill publishing emergency banner

Follow the Deploy an emergency banner guidance on Staging.

You’ll need to choose a non-serious and clearly fake news headline. For example:

  • CAMPAIGN_CLASS: Death of a notable person
  • HEADING: Henry Fielding dies
  • SHORT_DESCRIPTION: English novelist and dramatist known for his earthy humour and satire dies, age 47
  • LINK: https://en.wikipedia.org/wiki/Henry_Fielding
  • LINK_TEXT: More information
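Before starting the deploy, it can help to sanity-check the banner fields. The following is a minimal sketch, not part of the deploy tooling: the `validate_banner` helper is hypothetical, and treating all five fields as required is an assumption made for the drill.

```python
from urllib.parse import urlparse

# Field names taken from the example above. Treating every field as
# required is an assumption for this sketch.
REQUIRED_FIELDS = ("CAMPAIGN_CLASS", "HEADING", "SHORT_DESCRIPTION", "LINK", "LINK_TEXT")


def validate_banner(fields):
    """Return a list of problems found in the banner fields (empty if none)."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not fields.get(name, "").strip():
            problems.append(f"{name} is missing or empty")
    link = fields.get("LINK", "")
    if link and urlparse(link).scheme not in ("http", "https"):
        problems.append("LINK is not an http(s) URL")
    return problems


drill_banner = {
    "CAMPAIGN_CLASS": "Death of a notable person",
    "HEADING": "Henry Fielding dies",
    "SHORT_DESCRIPTION": (
        "English novelist and dramatist known for his earthy humour "
        "and satire dies, age 47"
    ),
    "LINK": "https://en.wikipedia.org/wiki/Henry_Fielding",
    "LINK_TEXT": "More information",
}

# Print any problems found with the drill banner.
print(validate_banner(drill_banner))
```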

Drill an end to end incident

Decide on a hypothetical incident scenario, e.g. “GOV.UK is down”. Walk through the incident management guidance. Use common sense when following the steps (for example, don’t actually publish an incident to Statuspage or email stakeholders).

Deploy from AWS CodeCommit when GitHub is unavailable

Choose an app and decide on an old release tag or branch to deploy. Follow the Deploying from AWS CodeCommit instructions in the Integration or Staging environment.

Run a Terraform plan

Follow the Deploy Terraform instructions, picking a project at random. You can run this in any environment: you are only running plan, not apply, so you should not be making any changes.
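To make the outcome of a plan easy to read during a drill, you could wrap it as in the sketch below. This assumes `terraform` is called directly from an already-initialised project directory; the Deploy Terraform instructions may use their own wrapper, in which case follow those instead. The helper names are hypothetical. The `-detailed-exitcode` flag makes `terraform plan` exit with 0 for no changes, 1 for an error and 2 when changes are pending.

```python
import subprocess


def plan_command(extra_args=()):
    """Build the argv for a plan that never prompts for input."""
    return ["terraform", "plan", "-detailed-exitcode", "-input=false", *extra_args]


def interpret_exit_code(code):
    """Translate the -detailed-exitcode result into a short summary."""
    if code == 0:
        return "no changes"
    if code == 2:
        return "changes pending"
    return "error"


def run_plan(workdir):
    """Run the plan in workdir (assumed to be initialised) and summarise it."""
    result = subprocess.run(plan_command(), cwd=workdir)
    return interpret_exit_code(result.returncode)
```

Calling `run_plan("path/to/project")` only runs plan, so nothing is applied.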

Update homepage promotion slots

Follow the Update homepage promotion slots instructions, using an appropriate image and text. Do this on Integration or Staging.

Use a restored database in an app

Follow the Restore an RDS instance via the AWS CLI instructions for an app of your choice, on Integration or Staging.

Force failover to GOV.UK mirror and Emergency publishing using the GOV.UK mirror

  1. Warn in #govuk-2ndline-tech that you’re about to do this, as it will lead to a spike in alerts and will also break continuous deployment for a while (due to Smokey failures).
  2. Follow the Forcing failover to the GOV.UK mirrors instructions on Integration or Staging.
  3. To verify that it worked, visit a page at random and purge the page from cache. Reload the page to see the ‘mirrored’ version of the content. NB: you wouldn’t do this in a real incident, as we’d want to serve Fastly’s cached version for as long as possible.
  4. Undo your changes so that Nginx handles requests again.
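The purge in step 3 can be sketched as a single HTTP request. This assumes the CDN accepts a `PURGE` request for a page URL from your network, which is how Fastly's single-URL purge works; if the documented purge procedure differs, follow that instead. The function names and the example URL are placeholders.

```python
import urllib.request


def build_purge_request(url):
    """Build (but do not send) a PURGE request for a single page URL."""
    return urllib.request.Request(url, method="PURGE")


def purge(url, timeout=10):
    """Send the PURGE request and return the HTTP status code."""
    with urllib.request.urlopen(build_purge_request(url), timeout=timeout) as response:
        return response.status


# Placeholder URL: substitute a page in the environment you are drilling.
request = build_purge_request("https://www.example.com/some-page")
print(request.get_method(), request.full_url)
```

After a successful purge, reloading the page should fetch it from the origin again, which during the drill is the mirror.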

Drill logging into accounts

Make sure you can log into the following accounts:

  1. Your individual Fastly account
  2. Your individual Statuspage account
  3. Your individual Logit account
  4. Shared Heroku account
  5. Shared CKAN account
  6. Shared Rubygems account
  7. Shared NPM account

Drill how to communicate when Slack is down

Ensure you know how to communicate with your 2nd line colleagues if Slack is unavailable. See “If Slack is unavailable” for details.

Drill scaling up number of workers

In preparation for a spike in traffic, you can increase the number of unicorn workers for an app. See established connections exceeded for details.

Pick an application and drill scaling up the number of workers - see example.

You can create a branch of govuk-puppet and deploy that branch to Integration to see the unicorn worker change take effect. Delete the branch and re-deploy the latest release of Puppet when you’re done.

Drill enabling a code freeze

Choose a continuously-deployed app where you can make a meaningful change to the default branch, e.g. fixing a typo, or merging a Dependabot PR.

Either before merging the change, or part way through the continuous deployment process, follow the instructions for implementing a deploy freeze for that app.

Follow the deployment pipeline in Jenkins. Confirm that no further environment deployments are triggered. For example, if you implemented the deploy freeze just after the app was deployed to Staging, confirm that the app was then not automatically deployed to Production.

Remove the code freeze, then manually push the changes to all remaining environments so that they’re in sync.
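When removing the freeze, it is easy to lose track of which environments still need a manual deploy. The sketch below works that out, assuming the pipeline deploys to Integration, then Staging, then Production in that order; the `environments_to_sync` helper is hypothetical.

```python
# Assumed deployment order for a continuously-deployed app.
PIPELINE = ["integration", "staging", "production"]


def environments_to_sync(last_deployed):
    """Return the environments after last_deployed that still need the change."""
    index = PIPELINE.index(last_deployed)
    return PIPELINE[index + 1:]


# Freeze applied just after the Staging deploy: only Production remains.
print(environments_to_sync("staging"))
```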