Before you start on 2nd line
Before you start your shift you’ll need access to the accounts we use and our communication channels.
The 2nd line dashboard can be viewed here.
We use Icinga to monitor our platform and alert us when things go wrong. Please ensure you can access (if remote you will need to be on the VPN):
Zendesk is our support ticketing system. Create an account if you don’t have one yet. Then ask a fellow 2nd liner to create a new ticket, assigned to 2nd/3rd Line—Zendesk Administration, requesting that you be given access to 2nd Line - GOV.UK Alerts and Issues.
PagerDuty is our escalation workflow for some key alerts that are likely to require urgent attention. The escalation order is:
- Primary Engineer
- Secondary Engineer
- Designated Programme team member (might not be technical)
This mirrors our out of hours on-call escalation order, so 2nd line can be thought of as in-hours on-call. 2nd line Shadows are not required to be on PagerDuty. If you’re on Primary or Secondary, please check you can sign in to PagerDuty. Speak to the Delivery Manager if you cannot sign in.
When an alert that triggers PagerDuty goes off, someone on the escalation schedule must acknowledge it, otherwise it will be escalated further. PagerDuty is used when key aspects of the site become unavailable or a large number of error pages are being served.
There is a PagerDuty drill every Wednesday morning at 10am UTC. PagerDuty will call you. Do not acknowledge this call. Instead, escalate the incident to the next person in the escalation order, so that each person in the workflow is alerted.
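As an illustration only, the escalation behaviour described above can be modelled in a few lines of Python. This is a sketch of the logic, not PagerDuty's API or our actual configuration: the function name `alerted_people` and the `acknowledged_by` parameter are hypothetical, and the role names are copied from the escalation order listed above.

```python
from typing import List, Optional

# Escalation order as listed in this document.
ESCALATION_ORDER = [
    "Primary Engineer",
    "Secondary Engineer",
    "Designated Programme team member",
]


def alerted_people(acknowledged_by: Optional[str] = None) -> List[str]:
    """Return everyone who gets alerted, in order, until someone acknowledges.

    Hypothetical helper for illustration. Passing acknowledged_by=None models
    the case where nobody acknowledges, as in the weekly drill, where each
    person escalates instead of acknowledging.
    """
    alerted = []
    for person in ESCALATION_ORDER:
        alerted.append(person)
        if person == acknowledged_by:
            # Acknowledging stops any further escalation.
            break
    return alerted


# A real alert acknowledged by the Secondary Engineer.
print(alerted_people("Secondary Engineer"))

# The drill: nobody acknowledges, so everyone in the order is alerted.
print(alerted_people())
```

In this model, an acknowledgement stops the escalation at that person, while the drill deliberately walks the whole list so that every level is exercised.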
You can also find out what to do if there’s an incident.
Follow these Slack channels while you’re working on 2nd line:
- #govuk-2ndline - the main channel for people on 2nd line
- #govuk-deploy - every Staging/Production deploy is automatically posted here; people also post manually when putting branches on Integration for testing
- #govuk-developers - this is a general channel for developers and can be a good place to ask questions if you are struggling
- #reliability-eng - to contact the Reliability Engineering (RE) interruptible person about urgent GOV.UK infrastructure issues