Restore Elasticsearch indices from backup

Warning: This document has not been updated for a while and may be out of date.

Last updated: 19 Jun 2023

Background

AWS Managed Elasticsearch automatically takes hourly snapshots for backup and disaster recovery purposes. The snapshot data is stored in an Amazon-owned S3 bucket that we cannot access directly via S3. Instead, it is exposed to us as an Elasticsearch snapshot repository called cs-automated-enc.

Restores are done via the Elasticsearch API, by making HTTP requests to the _snapshot endpoint.

We also have a govuk-production snapshot repository, which is normally only used for copying indices from production to the non-production environments.

Restore a specific index from a snapshot

List the available backup snapshots in the cs-automated-enc snapshot repository and identify the snapshot that you want to restore from.
```
k exec deploy/search-api -- \
  sh -c 'curl "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/_all?pretty"'
```
This can take a few seconds.
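Because snapshots are taken hourly, the JSON listing can be long. As a rough sketch, the `_cat/snapshots` API returns one line per snapshot, which is easier to filter. The helper name `successful_snapshots` is hypothetical, and this assumes the AWS-managed cluster allows `_cat/snapshots` on this repository.

```sh
# Hypothetical helper: reads "id status start_time" lines on stdin and
# prints the IDs of snapshots whose status is SUCCESS.
successful_snapshots() {
  awk '$2 == "SUCCESS" { print $1 }'
}

# Possible usage (assumption: _cat/snapshots is permitted on this cluster):
#   k exec deploy/search-api -- \
#     sh -c 'curl "$ELASTICSEARCH_URI/_cat/snapshots/cs-automated-enc?h=id,status,start_time&s=start_epoch"' \
#     | successful_snapshots | tail -n 5
```

Filtering on status avoids picking a snapshot that is partial, failed or still in progress.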
If an index already exists with the same name as the one you want to restore, delete the existing index. Elasticsearch will not restore over an open index of the same name.
```
k exec deploy/search-api -- sh -c 'curl -XDELETE "$ELASTICSEARCH_URI/<index-name>"'
```

Restore the index from the snapshot. Fill in <snapshot-id> and <index-name> as appropriate.
```
k exec deploy/search-api -- \
  sh -c 'curl -XPOST -H "Content-Type: application/json" "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/<snapshot-id>/_restore" -d "{\"indices\": \"<index-name>\"}"'
```
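The nested quoting in the restore command is easy to get wrong. One possible alternative, sketched below, is to build the JSON body locally and pipe it to curl over stdin. The helper name `restore_body` and the `index_name` variable are hypothetical, and the usage assumes that `k exec -i` forwards stdin to the container.

```sh
# Hypothetical helper: prints the JSON body for restoring a single index.
# Rejects empty names, leftover <placeholders> and embedded double quotes.
restore_body() {
  case "$1" in
    ''|*'<'*|*'>'*|*'"'*)
      echo "invalid index name: $1" >&2
      return 1
      ;;
  esac
  printf '{"indices": "%s"}\n' "$1"
}

# Possible usage (assumption: stdin is forwarded by `k exec -i`;
# curl reads the request body from stdin with `-d @-`):
#   restore_body "$index_name" | k exec -i deploy/search-api -- \
#     sh -c 'curl -XPOST -H "Content-Type: application/json" "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/<snapshot-id>/_restore" -d @-'
```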

The restore can take a few minutes. The /_cat/recovery resource gives an indication of progress.
```
k exec deploy/search-api -- sh -c 'curl "$ELASTICSEARCH_URI/_cat/recovery"'
```
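The full `/_cat/recovery` output has many columns. As a minimal sketch, the `h` parameter can limit it to the index, shard and stage columns, and a small filter can show only the shards that are still recovering. The helper name `unfinished_recoveries` is hypothetical.

```sh
# Hypothetical helper: reads "index shard stage" lines on stdin and prints
# the shards whose recovery stage is not yet "done".
unfinished_recoveries() {
  awk '$3 != "done" { print $1, $2, $3 }'
}

# Possible usage:
#   k exec deploy/search-api -- \
#     sh -c 'curl "$ELASTICSEARCH_URI/_cat/recovery?h=index,shard,stage"' \
#     | unfinished_recoveries
```

Empty output suggests that every listed shard has reached the `done` stage.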
Once the restore has finished, reprocess any content changes that happened after the backup.

The reprocessing step is necessary in order to bring the restored index up to date, because GOV.UK’s indexing is incremental only. In other words, there is no regular full reindex.

Restore all indices from a snapshot

Restoring all indices is a similar procedure to restoring a specific index.

Identify the snapshot to restore, as described in the first step of the previous section.

Delete all indices.

```
k exec deploy/search-api -- sh -c 'curl -XDELETE "$ELASTICSEARCH_URI/_all"'
```

Restore all indices from the snapshot.

```
k exec deploy/search-api -- \
  sh -c 'curl -XPOST "$ELASTICSEARCH_URI/_snapshot/cs-automated-enc/<snapshot-id>/_restore"'
```

Monitor the restore using /_cat/recovery. Once it has finished, reprocess recent content changes to bring the indices up to date. Both steps are described at the end of the previous section.
