This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.

mongod replication lag

Investigating the problem

A Fabric task shows MongoDB replication status information:

fab <environment> -H api-mongo-[n].api mongo.status
  • The db.printReplicationInfo() section shows where the primary node’s oplog is up to.
  • The db.printSlaveReplicationInfo() section shows where each secondary is synced to and how far behind the primary it is.
  • The rs.status() section shows the current status of each node and the last heartbeat error message for the secondaries.
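
As a rough illustration of how lag can be read from the rs.status() section, the sketch below subtracts each secondary's optimeDate from the primary's. It assumes the status has already been loaded into a Python dict with the standard members, name, stateStr and optimeDate fields. The function name, host names and timestamps are hypothetical, and this is not how the Fabric task itself is implemented.

```python
from datetime import datetime, timezone


def secondary_lag_seconds(status):
    """Return {member name: seconds behind the primary} for each secondary.

    `status` is assumed to be a dict shaped like the output of rs.status().
    """
    members = status["members"]
    # Raises StopIteration if no member currently reports itself as PRIMARY.
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data; real host names and timestamps will differ.
sample_status = {
    "members": [
        {
            "name": "api-mongo-1.api:27017",
            "stateStr": "PRIMARY",
            "optimeDate": datetime(2020, 2, 11, 12, 0, 0, tzinfo=timezone.utc),
        },
        {
            "name": "api-mongo-2.api:27017",
            "stateStr": "SECONDARY",
            "optimeDate": datetime(2020, 2, 11, 11, 59, 30, tzinfo=timezone.utc),
        },
        {
            "name": "api-mongo-3.api:27017",
            "stateStr": "SECONDARY",
            "optimeDate": datetime(2020, 2, 11, 11, 0, 0, tzinfo=timezone.utc),
        },
    ]
}

lags = secondary_lag_seconds(sample_status)

# Print the most lagged secondary first.
for name, lag in sorted(lags.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {lag:.0f}s behind primary")
```

Sorting by lag puts the node that is furthest behind at the top, which can help when deciding which single secondary to look at first.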

Possible fixes

Be mindful that replication may increase load on the primary mongo node, so consider restarting only one node at a time.

  • Try restarting one of the lagging mongod secondaries:

    fab <environment> -H api-mongo-[n].api app.restart:mongodb
    

This may restart replication on that node, and also cause the other lagging node to resync with the primary node and restart its own replication.

  • If restarting doesn’t solve the problem, force a resync with Fabric:

    fab <environment> -H api-mongo-[n].api mongo.force_resync
    
This page was last reviewed on 11 February 2020. It needs to be reviewed again on 11 August 2020 by the page owner #govuk-2ndline.