
Replay traffic to correct an out-of-sync search index

If the data in the search index is out of sync with the Publishing API (for example, after restoring a backup), any publish and unpublish messages that have not been processed need to be resent.

govuk index

The govuk index is populated from the Publishing API message queue, so missing documents can be recovered by resending their content to the queue. In the Publishing API, run the following rake task (including the quotes) to replay traffic between two timestamps:

bundle exec rake 'represent_downstream:published_between[2018-12-17T01:02:30, 2018-12-18T10:20:30]'
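The quotes matter because the task argument contains square brackets and a space, which the shell would otherwise interpret or split. As a minimal sketch, the argument can be built from two timestamps like this. The helper name `published_between_task` is hypothetical and not part of the Publishing API; the snippet only prints the command for review.

```shell
# Hypothetical helper: build the rake task argument for a replay window.
# $1 = start timestamp, $2 = end timestamp (same format as the example above).
published_between_task() {
  printf 'represent_downstream:published_between[%s, %s]' "$1" "$2"
}

task=$(published_between_task "2018-12-17T01:02:30" "2018-12-18T10:20:30")

# Print the command for review. To run it for real, pass the task as a single
# quoted argument: bundle exec rake "$task"
echo "bundle exec rake '$task'"
```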

Other replay options are available, for example replaying all traffic for a single publishing app or doctype. Be aware that these options replay the entire Publishing API history for that app or doctype, and may take some time.

government/detailed indexes

The process in this section will no longer be necessary once whitehall content has been moved to the govuk index.

These indexes are populated by whitehall calling an HTTP API in Search API.

We have also set up Gor logging for POST and GET requests so that the traffic can be replayed.

The logs are stored on the search-api servers. You will need to run the replay on each server.

The location of the logs is:

/var/log/gor_dump

Copy the log file before running the restore, and replay from the copy. The requests made by the restore are themselves logged to the original file.
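One way to take that copy is sketched below. The helper name `copy_gor_log` and the working directory are assumptions for illustration, and the exact log file name on each server may differ from the example.

```shell
# Hypothetical helper: copy a Gor log into a working directory and print the
# path of the copy, so the replay reads from a file that is no longer written to.
# $1 = path to the log file, $2 = working directory (created if missing).
copy_gor_log() {
  src=$1
  work_dir=$2
  mkdir -p "$work_dir" || return 1
  cp "$src" "$work_dir/" || return 1
  printf '%s/%s\n' "$work_dir" "$(basename "$src")"
}

# Example (assumed paths; may need sudo depending on file permissions):
#   copy_gor_log /var/log/gor_dump/20171031.log /tmp/gor_replay
```

The printed path is the file to pass to the replay command in place of the original log.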

The following command can be used to run the restore:

$ sudo goreplay -input-file "20171031.log|1000%" -stats -output-http-stats -output-http "http://localhost:3233/|6000%" -verbose

This runs the restore at 10 times the speed at which the traffic was recorded (the 1000% on the input file), so each hour of logs takes 6 minutes to process.
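The timing above is simple arithmetic, and a rough estimate for a longer log can be worked out the same way. This is a sketch only: `replay_minutes` is a hypothetical helper, and it assumes the replay keeps up with the requested speed.

```shell
# Hypothetical helper: estimate replay time in whole minutes.
# $1 = hours of recorded traffic, $2 = replay speed as a percentage (1000 = 10x).
replay_minutes() {
  hours_of_logs=$1
  speed_percent=$2
  echo $(( hours_of_logs * 60 * 100 / speed_percent ))
}

replay_minutes 1 1000    # one hour of logs at 1000%: prints 6
replay_minutes 24 1000   # a full day of logs at 1000%: prints 144
```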

This page was last reviewed on 27 March 2019. It needs to be reviewed again on 27 June 2019 by the page owner #govuk-2ndline.