This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.

Smokey loop tests

Smokey runs in a continuous loop in each environment and dumps the output of each run into a tmp/smokey.json file. We have Icinga checks for most Smokey features, so that we are alerted when some aspect of GOV.UK may be in trouble.

When a test fails, you should see a “Smokey loop for <feature>” alert. The alert description should contain the reason for the failure, so you can diagnose the problem.

NOTE: we have a separate “Smokey” alert for manual runs of the Smokey job in Jenkins. This alert covers all Smokey features, while “Smokey loop” alerts are more granular.


If many of the tests are failing in an AWS environment, it may be because the Nginx services haven’t registered new boxes coming online or old ones going offline. You can try reloading Nginx on the affected machine classes and then restarting smokey-loop:

$ fab $environment class:cache app.reload:nginx
$ fab $environment class:draft_cache app.reload:nginx
$ fab $environment class:monitoring app.reload:nginx
$ fab $environment class:monitoring app.restart:smokey-loop
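
The commands above differ only in the machine class and the action, so they can be generated in a loop. The sketch below only prints the commands for review and does not run them. The function name print_reload_commands is hypothetical and is not part of fab or any GOV.UK tooling.

```shell
# Print (do not run) the fab commands listed above for one environment.
# print_reload_commands is an illustrative helper, not an existing tool.
print_reload_commands() {
  environment="$1"
  # Reload Nginx on each machine class named in this runbook.
  for class in cache draft_cache monitoring; do
    echo "fab $environment class:$class app.reload:nginx"
  done
  # Finally restart the Smokey loop itself on the monitoring class.
  echo "fab $environment class:monitoring app.restart:smokey-loop"
}
```

For example, `print_reload_commands production` prints the four commands in order, which you can then run one at a time.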

Traceback (most recent call last): or /tmp/smokey.json is older than 30m

If you see either of these errors in Icinga, it may mean that the smokey-loop process has died. You can try looking through the logs or restarting the process.

$ ssh monitoring-1.production
> sudo less /var/log/upstart/smokey-loop.log
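
The 30-minute staleness condition from the alert can also be checked by hand on the monitoring machine. This is a minimal sketch, assuming `find` supports `-mmin` (GNU and BSD find both do). The helper name is_stale is hypothetical and this is not how the Icinga check itself is implemented.

```shell
# Succeed (exit 0) if FILE is missing or was last modified more than
# MINUTES minutes ago; fail (exit 1) if it is fresh.
# is_stale is an illustrative helper, not part of the Icinga check.
is_stale() {
  file="$1"
  minutes="$2"
  if [ ! -e "$file" ]; then
    return 0
  fi
  # find prints the path only when its mtime is older than the threshold.
  [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}
```

For example, `is_stale /tmp/smokey.json 30 && echo "smokey-loop may have died"` prints the message only when the file is missing or has not been updated in the last 30 minutes.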

HTTP status code 550 (RestClient::RequestFailed)

This usually means that the BrowserMob Proxy Java process is still running from a previously aborted smokey-loop, so the new smoke tests cannot start a new proxy. To fix this, kill the existing Java process and restart smokey-loop.

Replace the process ID (6385 in this example) as appropriate:

$ gds govuk connect -e production ssh aws/monitoring
> sudo service smokey-loop stop
> ps -ef | grep java
smokey    6385  6380 26 14:58 ?        00:00:54 java -Dbasedir=/opt/smokey -jar /opt/smokey/lib/browsermob-dist-2.1.4.jar --port 3222
> sudo kill -9 6385
> sudo service smokey-loop start
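
Reading the process ID out of the `ps -ef` output by eye is easy to get wrong, so the lookup can be sketched as a small filter. This assumes the standard `ps -ef` column layout shown above (PID in the second column, command starting in the eighth). The function name browsermob_pids is hypothetical.

```shell
# Read `ps -ef` output on stdin and print the PID of each process whose
# command is java and whose arguments mention browsermob.
# browsermob_pids is an illustrative helper, not an existing tool.
browsermob_pids() {
  # $2 is the PID column; $8 is the first word of the command.
  # Matching on $8 keeps the filter from matching its own awk process.
  awk '$8 ~ /(^|\/)java$/ && /browsermob/ { print $2 }'
}
```

On the monitoring machine, `ps -ef | browsermob_pids` should print only the BrowserMob Proxy process IDs. Check the result before passing it to `sudo kill`.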

Smokey user

These tests rely on a user in GOV.UK Signon. Signon passphrases expire periodically for all users, and the tests will fail once this account’s passphrase has expired.

You should change the passphrase of the account and rotate it in encrypted hieradata. Here’s an example PR in govuk-secrets.

This page was last reviewed on 8 July 2020. It needs to be reviewed again on 8 January 2021 by the page owner #govuk-2ndline.
