This page describes what to do in case of an Icinga alert. For more information, you can search the govuk-puppet repo for the source of the alert.

Fastly error rate for GOV.UK

We get response code reporting from Fastly (with a 15-minute delay). It averages out the last 15 minutes' worth of 5xx errors. This is a useful supplementary metric for highlighting low-level errors that occur over a longer period of time.

The alert appears on

A good starting point for investigation is to examine the Fastly CDN logs.

  • ssh
  • cd /mnt/logs_cdn to access log files

Alternatively you can look in Kibana with the query application:"govuk-cdn-logs-monitor"
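Once you are in the log directory, a quick count of 5xx responses can help to confirm what the alert is reporting. The following is a minimal sketch, not part of the documented procedure. It assumes the log lines are whitespace-separated and that you know which field holds the HTTP status code. The function name count_5xx and the field position are hypothetical, so check them against the real log format before relying on the result.

```shell
# Count 5xx responses in a log stream read from stdin.
# Assumption: the HTTP status code is a whitespace-separated field.
# The field position passed as $1 is hypothetical and must match
# the real CDN log format.
count_5xx() {
  status_field="$1"
  awk -v f="$status_field" '$f ~ /^5[0-9][0-9]$/ { n++ } END { print n + 0 }'
}

# Example usage (file name and field number are placeholders):
#   count_5xx 9 < some-log-file
```

The count is only as good as the field number, so it is worth inspecting a few lines of the file first.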

Unknown alert

The alert appears on

Collectd uses the Fastly API to get statistics, which it pushes to Graphite. If the alert status is unknown, collectd probably cannot talk to Fastly, so restart collectd.

$ sudo service collectd restart

To confirm that collectd is the problem, use this query in Kibana: AND syslog_program:collectd

You will see many reports similar to:

cdn_fastly plugin: Failed to query service
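If Kibana is unavailable, the same message can be counted directly in a log file on the machine. This is a minimal sketch under assumptions: it supposes collectd's messages end up in a plain-text syslog file, and the function name count_fastly_failures and the path you pass in are hypothetical, so adjust them to wherever syslog is written on the host.

```shell
# Count occurrences of the collectd Fastly failure message in a log file.
# Assumption: collectd messages are written to a plain-text syslog file.
# The path passed as $1 is hypothetical (for example /var/log/syslog).
count_fastly_failures() {
  log_file="$1"
  # -F matches the message as a fixed string; -c prints the match count.
  # "|| true" keeps the exit status zero when there are no matches.
  grep -c -F 'cdn_fastly plugin: Failed to query service' "$log_file" || true
}

# Example usage (path is a placeholder):
#   count_fastly_failures /var/log/syslog
```

A count that stops growing after the restart suggests collectd can reach Fastly again.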


This page was last reviewed . It needs to be reviewed again by the page owner #govuk-2ndline.