Last updated: 21 Nov 2024

Raise issues with Reliability Engineering

You may experience an issue with GOV.UK where you need help from a Site Reliability Engineer (SRE). The SREs generally work on the Platform Engineering team.

If you require assistance

Ask in #govuk-platform-engineering.

Understanding what SREs can assist with

For a broad explanation of the different areas of support in GOV.UK, see the "ask for help" page.

SREs can help with:

  • Scalability and resilience
  • Designing or improving monitoring, metrics, tracing and observability of system behaviour
  • Troubleshooting complex problems
  • Designing new systems or backend features (APIs, information storage and processing)
  • Designing for graceful degradation under failure conditions (for example the GOV.UK static mirrors)
  • Migrating from legacy systems, for example GOV.UK Puppet
  • Upgrading software packages that are end-of-life, have security issues or are no longer fit for purpose
  • Advice on how to structure or maintain Terraform modules for managing cloud resources
  • Continuous integration, delivery and deployment systems (CI/CD), build and release automation