This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.

‘mongod replication lag’

Investigating the problem

There is a fabric task that shows mongo replication status information:

fab <environment> -H mongo-?.backend mongo.status
  • The db.printReplicationInfo() section shows the time range covered by the primary node’s oplog, including its most recent entry.
  • The db.printSlaveReplicationInfo() section shows where each secondary is synced to and how far behind the primary it is.
  • The rs.status() section shows the current status of each node and the last heartbeat error message for the secondaries.
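
To make the lag figure concrete, here is a minimal sketch of how the lag of each secondary could be derived from rs.status()-style data, by comparing each member's optimeDate with the primary's. The function name and the sample document (host names, timestamps) are hypothetical and only illustrate the shape of the output; this is not what the fabric task itself runs.

```python
from datetime import datetime, timezone


def replication_lag_seconds(rs_status):
    """Return {member name: seconds behind the primary} for each secondary.

    Expects a dict shaped like the output of rs.status(): a "members" list
    whose entries carry "name", "stateStr" and "optimeDate".
    """
    members = rs_status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no primary found in replica set status")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data, shaped like rs.status() output.
sample_status = {
    "set": "example",
    "members": [
        {
            "name": "mongo-1.backend:27017",
            "stateStr": "PRIMARY",
            "optimeDate": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
        },
        {
            "name": "mongo-2.backend:27017",
            "stateStr": "SECONDARY",
            "optimeDate": datetime(2024, 1, 1, 11, 59, 58, tzinfo=timezone.utc),
        },
        {
            "name": "mongo-3.backend:27017",
            "stateStr": "SECONDARY",
            "optimeDate": datetime(2024, 1, 1, 11, 45, 0, tzinfo=timezone.utc),
        },
    ],
}

lags = replication_lag_seconds(sample_status)
for name, lag in sorted(lags.items()):
    print(f"{name}: {lag:.0f}s behind primary")
```

In this made-up sample the third node is many minutes behind the primary, which is the kind of gap the alert is about, while the second node is only a couple of seconds behind.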

Possible fixes

  • Try restarting one of the lagging mongod secondaries:

    fab <environment> -H mongo-?.backend app.restart:mongodb

This may restart replication on that node, and also cause the other lagging node to resync with the primary node and restart its own replication.

  • If restarting doesn’t solve the problem, force a resync with fabric:

    fab <environment> -H mongo-?.backend mongo.force_resync

This page is owned by #2ndline and needs to be reviewed.