This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.

mongod replication lag

Investigating the problem

There is a Fabric task that shows MongoDB replication status information:

fab <environment> -H api-mongo-[n].api mongo.status
  • The db.printReplicationInfo() section shows where the primary node’s oplog is up to.
  • The db.printSlaveReplicationInfo() section shows where each secondary is synced to and how far behind the primary it is.
  • The rs.status() section shows the current status of each node and the last heartbeat error message for the secondaries.
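
To make the lag figures concrete, here is a minimal Python sketch of how a secondary's lag can be derived from an rs.status()-style document: the primary's optimeDate minus each secondary's optimeDate. It is not part of the Fabric task. The function name replication_lag, the sample host names, ports and timestamps are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta


def replication_lag(status):
    """Return {secondary name: lag behind the primary} for an rs.status()-style dict."""
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no PRIMARY member found; lag cannot be computed")
    # Lag is the gap between the primary's last applied operation
    # and each secondary's last applied operation.
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample data (host names, ports and times are made up).
sample_status = {
    "members": [
        {"name": "api-mongo-1.api:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2019, 4, 17, 12, 0, 0)},
        {"name": "api-mongo-2.api:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2019, 4, 17, 11, 59, 30)},
        {"name": "api-mongo-3.api:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2019, 4, 17, 11, 45, 0)},
    ],
}

for name, lag in replication_lag(sample_status).items():
    print(f"{name} is {lag.total_seconds():.0f}s behind the primary")
```

In this made-up sample the third node is 15 minutes behind, so it would be the first one to look at.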

Possible fixes

Replication can increase load on the primary mongo node, so consider restarting only one node at a time.

  • Try restarting one of the lagging mongod secondaries:

    fab <environment> -H api-mongo-[n].api app.restart:mongodb
    

This may restart replication on that node, and also cause the other lagging node to resync with the primary node and restart its own replication.

  • If restarting doesn’t solve the problem, force a resync with Fabric:

    fab <environment> -H api-mongo-[n].api mongo.force_resync
    
This page was last reviewed on 17 April 2019. It needs to be reviewed again on 17 October 2019 by the page owner #govuk-2ndline.