This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.
Warning: This document has not been updated for a while. It may be out of date.
Last updated: 13 May 2021

Email Alert API: Unprocessed work

This alert indicates that Email Alert API has work that has not been processed within the generous amount of time we allow for it. Which alert you see depends on the type of work.

Each alert is based on custom metrics that we collect using a periodic job. A metric will be something like “amount of unprocessed work older than X amount of time”.
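As a rough illustration of the kind of metric described above, the following sketch counts unprocessed items older than a threshold. It is not the application's actual metric code: `WorkItem`, `unprocessed_older_than` and the in-memory list are hypothetical stand-ins for the real database records and periodic job.

```ruby
# Hypothetical stand-in for a unit of work; the real application uses database records.
WorkItem = Struct.new(:id, :created_at, :processed_at)

# Count items that are still unprocessed and were created before the cutoff.
def unprocessed_older_than(items, age_in_seconds, now: Time.now)
  cutoff = now - age_in_seconds
  items.count { |item| item.processed_at.nil? && item.created_at < cutoff }
end

now = Time.now
items = [
  WorkItem.new(1, now - 7200, nil),        # two hours old, unprocessed
  WorkItem.new(2, now - 7200, now - 7000), # two hours old, already processed
  WorkItem.new(3, now - 60, nil),          # one minute old, unprocessed
]

# Only item 1 is both unprocessed and older than one hour.
puts unprocessed_older_than(items, 3600, now: now)
```

An alert would then fire when this count stays above zero.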

Automatic recovery

Sometimes we lose work due to a flaw in the Sidekiq queueing system. To cope with this scenario, a RecoverLostJobsWorker runs every 30 minutes and tries to requeue work that has not been processed within an hour. If work is being repeatedly lost, the alert will fire and you’ll need to investigate manually.
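The recovery logic described above can be sketched in plain Ruby. This is a simplified illustration, not the real RecoverLostJobsWorker: `Job`, `lost_jobs`, `requeue_lost_jobs` and the array used as a queue are hypothetical, and skipping IDs that are already queued is an assumption of this sketch rather than confirmed behaviour.

```ruby
# Hypothetical in-memory stand-in; the real worker operates on database records and Sidekiq.
Job = Struct.new(:id, :created_at, :processed_at)

RECOVERY_THRESHOLD = 60 * 60 # one hour, in seconds

# Return the jobs that look lost: unprocessed and older than the threshold.
def lost_jobs(jobs, now: Time.now)
  jobs.select { |job| job.processed_at.nil? && job.created_at < now - RECOVERY_THRESHOLD }
end

# Push each lost job's ID back onto the queue.
# Skipping IDs that are already queued is an assumption made for this sketch.
def requeue_lost_jobs(jobs, queue, now: Time.now)
  lost_jobs(jobs, now: now).each do |job|
    queue << job.id unless queue.include?(job.id)
  end
  queue
end

now = Time.now
jobs = [
  Job.new(1, now - 7200, nil),        # lost: unprocessed for two hours
  Job.new(2, now - 600, nil),         # recent: still within the hour
  Job.new(3, now - 7200, now - 3600), # already processed
]

queue = []
requeue_lost_jobs(jobs, queue, now: now)
p queue
```

Running the recovery a second time leaves the queue unchanged in this sketch, so a job that is merely slow is not queued twice.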

Manual steps to fix

Things to check:

If all else fails, you can try running the work manually from a console. The automatic recovery worker code is a good example of how to do this, but you will need to use new.perform instead of perform_async.
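To show why new.perform is used in a console, here is a minimal sketch of the difference between the two calls. `ExampleWorker` is a hypothetical class that only mimics the shape of a Sidekiq worker: in real Sidekiq, perform_async pushes a job to Redis for a background process to pick up, while calling perform on a new instance runs the work immediately in the current process.

```ruby
# Hypothetical worker mimicking the two entry points of a Sidekiq worker.
class ExampleWorker
  QUEUE = []     # stands in for the queue that Sidekiq keeps in Redis
  PROCESSED = [] # records the work that has actually been done

  # Like perform_async: only records the job, does not run it.
  def self.perform_async(*args)
    QUEUE << args
  end

  # The actual work; calling this on a new instance runs it straight away.
  def perform(item_id)
    PROCESSED << item_id
  end
end

ExampleWorker.perform_async(1) # queued, nothing processed yet
ExampleWorker.new.perform(2)   # processed immediately in this process

p ExampleWorker::QUEUE
p ExampleWorker::PROCESSED
```

Running the work inline also means any exception is raised directly in the console, which helps when investigating why the work was not being processed.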

A digest run may be “complete” - all work items generated, all work items processed - but not marked as such. In this case, you will need to use slightly different commands to investigate the incomplete run:

# find which digests are "incomplete"
DigestRun.where("created_at < ?", 1.hour.ago).where(completed_at: nil)

# try manually marking it as complete
DigestRunCompletionMarkerWorker.new.perform