How we run RabbitMQ
We run a RabbitMQ cluster, which is used to trigger events when documents are published. Messages are published to “exchanges” in RabbitMQ. Applications create “queues”, which are bound to the exchanges and collect the messages sent to them. Applications then run “consumers”, which receive messages from the queues.
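The flow from exchange to queue to consumer can be pictured with a toy in-memory model. This is only a sketch: it does not talk to RabbitMQ, it copies every message to every bound queue (real exchanges route by exchange type and routing key), and the queue names are hypothetical.

```python
from collections import deque


class Exchange:
    """Toy stand-in for an exchange: copies each message to every bound queue."""

    def __init__(self):
        self.queues = []

    def bind(self, queue):
        self.queues.append(queue)

    def publish(self, message):
        for queue in self.queues:
            queue.append(message)


def consume(queue, handler):
    """Drain a queue, passing each message to the handler in arrival order."""
    while queue:
        handler(queue.popleft())


exchange = Exchange()
search_queue = deque()  # hypothetical queue owned by one application
email_queue = deque()   # hypothetical queue owned by another application
exchange.bind(search_queue)
exchange.bind(email_queue)

exchange.publish({"event": "document_published", "id": 1})

received = []
consume(search_queue, received.append)
```

Each application's queue holds its own copy of the message, so one consumer draining its queue does not affect the other application's backlog.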
To keep our consumers active, we publish “heartbeat” messages to the exchanges every minute. This helps avoid consumers dropping their connections due to inactivity, and also lets us monitor activity easily.
Heartbeat messages are sent every minute by cron. Currently, we only
send heartbeat messages to one exchange: the
exchange. These heartbeats are sent via a rake task in the
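As an illustration of how heartbeats make monitoring easier, a check along these lines could flag a consumer that has not seen a heartbeat recently. This is a minimal sketch, not our monitoring code: the function name and the tolerance of three missed heartbeats are assumptions, and only the one-minute interval comes from the description above.

```python
from datetime import datetime, timedelta, timezone

HEARTBEAT_INTERVAL = timedelta(minutes=1)  # the one-minute schedule described above
MISSED_BEATS_ALLOWED = 3                   # assumed tolerance, not from our configuration


def is_stale(last_heartbeat, now=None):
    """Return True if no heartbeat has been seen within the allowed window."""
    now = now or datetime.now(timezone.utc)
    return now - last_heartbeat > HEARTBEAT_INTERVAL * MISSED_BEATS_ALLOWED
```

A consumer would record the time of each heartbeat it receives, and a monitoring check would call `is_stale` with that timestamp.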
Connecting to the RabbitMQ web control panel
- Create an SSH tunnel to access the web control panel
For example, for integration:
ssh rabbitmq-1.backend.integration -L 15672:127.0.0.1:15672
- Log in to the web control panel
Point your browser at http://127.0.0.1:15672
The username is root. You can obtain the password from the deployment repo. Look for govuk_rabbitmq::root_password in the file for the relevant environment in:
- Do your business
- Tidy up
Close the SSH connection you set up earlier with CTRL+C or by typing “exit”.
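If the browser cannot reach the control panel during the steps above, it can help to confirm that the tunnel is actually listening on the local port. The following is a small sketch using only the Python standard library; the function name is hypothetical, and the host and port defaults are taken from the SSH command above.

```python
import socket


def tunnel_is_up(host="127.0.0.1", port=15672, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("tunnel up" if tunnel_is_up() else "tunnel down")
```

This only shows that something is listening locally; it does not prove that the remote end of the tunnel is the RabbitMQ control panel.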
Inspecting/removing items from a queue
We had an instance where an application was unable to process a message but left it on the queue, which meant the queue was backing up. Removing the message from the queue was the right solution in our case.
- Find the queue
Click on the “Queues” tab. Then click on the name of the queue.
- Find the messages
Scroll down and click “Get messages”. Clicking the “Get Message(s)” button that appears will fetch however many messages you ask for.
Note: Fetching messages actually removes them from the queue. By leaving the “Requeue” option set to “Yes”, they will be added back to the queue.
- Delete the messages
Repeat the previous step, but change the “Requeue” option to “No”.
Note: there is a risk that you might delete the wrong message(s), because the contents of the queue may have changed since you last fetched them.
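The same fetch-and-requeue behaviour is exposed by the management plugin's HTTP API, which is reachable through the same SSH tunnel. The sketch below only builds the request, so the requeue choice can be inspected before anything is sent. It assumes a recent management plugin where the request body uses an `ackmode` field (older versions used a boolean `requeue` field instead, so check the API documentation for the installed version). The function name and the queue name in the usage note are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_get_messages_request(queue, password, count=1, requeue=True,
                               vhost="/", base_url="http://127.0.0.1:15672"):
    """Build (but do not send) a request that fetches messages from a queue."""
    # The vhost and queue name must be percent-encoded; the default vhost "/" becomes "%2F".
    vhost_enc = urllib.parse.quote(vhost, safe="")
    queue_enc = urllib.parse.quote(queue, safe="")
    url = f"{base_url}/api/queues/{vhost_enc}/{queue_enc}/get"
    body = {
        "count": count,
        # requeue=True puts the messages back; requeue=False removes them for good.
        "ackmode": "ack_requeue_true" if requeue else "ack_requeue_false",
        "encoding": "auto",
    }
    token = base64.b64encode(f"root:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Basic {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

For example, `build_get_messages_request("example_queue", password, count=5)` mirrors fetching five messages with “Requeue” set to “Yes”. Passing the result to `urllib.request.urlopen` would send it. As with the web control panel, inspect with `requeue=True` first and only then repeat with `requeue=False`.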