Raise issues with Reliability Engineering
While on Technical 2nd Line, you may encounter an issue with GOV.UK that needs help from the Site Reliability Engineers (SREs) who work on GOV.UK infrastructure. The SREs previously worked in the RE GOV.UK team in Reliability Engineering, but most now work as part of the Platform Engineering team. It is best to use the RE GOV.UK channels for communication.
There are Reliability Engineering docs for users of their systems. There are also other Reliability Engineering docs for use by the team, which may contain more technical detail.
If you require assistance
#govuk-platform-reliability or in
If a problem is not urgent
If the issue you’ve identified seems like a non-urgent story, you can add it to the GOV.UK Technical 2nd Line Trello board in the “Ongoing issues to be aware of & unexplained events” column. The Technical 2nd Line tech lead(s) will then decide whether to pass it on to another team, manage the ticket through its life cycle, or resolve the problem themselves.
Understanding what SREs can assist with
There is a broad explanation of the different areas of support in GOV.UK in ask for help.
More specifically for GOV.UK, SREs can help with:
- GOV.UK Puppet
- Upgrading software packages that are end-of-life, have security issues, or are no longer fit for purpose
- Running and maintaining the Terraform configurations for AWS
- Maintaining the mirror configuration
- Keeping the CI environment running (GOV.UK are responsible for job configuration)