Backup and restore Elasticsearch indices
GOV.UK uses AWS Managed Elasticsearch, which takes daily snapshots of
the cluster as part of the managed service. These are stored in an S3
bucket that is not made available to us. Restoration is done by
making HTTP requests to the _snapshot endpoint.
To restore a snapshot, follow these steps:
SSH to a search box:
gds govuk connect ssh -e integration search
Query the _snapshot endpoint of Elasticsearch to get the snapshot repository name:
govuk_setenv search-api \
  bash -c 'curl "$ELASTICSEARCH_URI/_snapshot?pretty"'
Query the _all endpoint to identify the available snapshots in the named repository:
govuk_setenv search-api \
  bash -c 'curl "$ELASTICSEARCH_URI/_snapshot/<repository-name>/_all?pretty"'
If an index already exists with the same name as the one being restored, delete the existing index:
govuk_setenv search-api \
  bash -c 'curl -XDELETE "$ELASTICSEARCH_URI/<index-name>"'
Restore the index from the snapshot:
govuk_setenv search-api \
  bash -c 'curl -XPOST -H "Content-Type: application/json" "$ELASTICSEARCH_URI/_snapshot/<repository-name>/<snapshot-id>/_restore" -d "{\"indices\": \"<index-name>\"}"'
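The lookups in the steps above can be partly scripted. The following is a minimal sketch and not part of the documented procedure: snapshot_names and describe_index_status are hypothetical helper names, and the sketch assumes they are defined in your own shell on the search box. snapshot_names pulls snapshot names out of the _all response using grep and sed rather than a real JSON parser, and it does not filter on snapshot state. describe_index_status interprets the HTTP status that Elasticsearch returns for a HEAD request on an index (200 if the index exists, 404 if it does not).

```shell
# Hypothetical helper: read a snapshot listing (JSON) on stdin and print
# one snapshot name per line. Rough text matching, not a JSON parser.
snapshot_names() {
  grep -o '"snapshot" *: *"[^"]*"' |
    sed -e 's/^"snapshot" *: *"//' -e 's/"$//'
}

# Hypothetical helper: translate the HTTP status of a HEAD request on
# an index into a short description.
describe_index_status() {
  case "$1" in
    200) echo "exists" ;;
    404) echo "missing" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Example usage on the search box (placeholders as in the steps above):
#
#   govuk_setenv search-api \
#     bash -c 'curl -s "$ELASTICSEARCH_URI/_snapshot/<repository-name>/_all"' \
#     | snapshot_names
#
#   describe_index_status "$(govuk_setenv search-api \
#     bash -c 'curl -s -o /dev/null -I -w "%{http_code}" "$ELASTICSEARCH_URI/<index-name>"')"
```

Because the helpers only read the output of the existing curl commands, they run outside the govuk_setenv environment and never touch the cluster themselves.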
Further information about Elasticsearch snapshots can be found in the AWS documentation.
After a restore, you will need to fix the out-of-date search indices, since any changes made in publishing apps after the backup was taken will be missing.
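Before fixing the out-of-date indices, it can help to confirm that the restore has actually finished, since by default Elasticsearch accepts the restore request and recovers the shards in the background. The following is a minimal sketch, assuming the _cat/recovery endpoint is available on the managed cluster and that its stage column reads done for every shard once recovery is complete. all_shards_done is a hypothetical helper name.

```shell
# Hypothetical helper: read one recovery stage per line on stdin and
# succeed only if there is at least one line and every line is "done".
all_shards_done() {
  awk 'NF { n++; if ($1 != "done") bad = 1 }
       END { exit (n == 0 || bad) ? 1 : 0 }'
}

# Example usage on the search box (placeholder as in the steps above):
#
#   govuk_setenv search-api \
#     bash -c 'curl -s "$ELASTICSEARCH_URI/_cat/recovery/<index-name>?h=stage"' \
#     | all_shards_done && echo "restore complete"
```

An empty response is treated as not done, so a mistyped index name does not report a finished restore.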