This page was set to be reviewed before 2018-04-27 by the page owner: #govuk-2ndline. This might mean the content is out of date. Read how to review a page.

Reindex an Elasticsearch index

After updating an Elasticsearch index’s schema by changing the fields or document types, you need to reindex the affected index before the new fields and types can be used.

The reindexing process:

  1. Locks the Elasticsearch index to prevent writes to the index while data is being copied
  2. Creates a new index using the schema defined in the deployed version of rummager
  3. Copies all the data from the old to the new index
  4. Compares the old and new data to check for inconsistencies
  5. If everything looks the same, switches the alias to the new index

How to reindex an Elasticsearch index

Do not reindex on production during working hours except in an emergency. Reindexing locks the index for writes, so content published during the reindex is not added to the search index. See the Replay traffic section below if you need to reindex during working hours.

To reindex, run the rummager:migrate_schema rake task:

bundle exec rake rummager:migrate_schema CONFIRM_INDEX_MIGRATION_START=1 RUMMAGER_INDEX=alias_of_index_to_migrate

If you set the last parameter to RUMMAGER_INDEX=all, rummager will reindex all the indices sequentially.

You can run this task from Jenkins, but it will block other rake tasks from being run for 15 minutes to an hour. You can avoid this by running the command directly on a search machine, but you need to prefix the command with govuk_setenv rummager to make sure the Elasticsearch hostname is set correctly.
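As a rough sketch of the direct approach, the prefixed command can be assembled as below. The reindex_command helper is hypothetical and only prints the command so you can review it before running it. The alias govuk is an assumed example, and the sketch assumes you run the printed command from the directory where rummager is deployed.

```shell
# Hypothetical helper: prints the reindex command for use directly on a
# search machine. It does not run anything.
reindex_command() {
  # $1 is the alias of the index to migrate, or "all" for every index.
  printf '%s\n' "govuk_setenv rummager bundle exec rake rummager:migrate_schema CONFIRM_INDEX_MIGRATION_START=1 RUMMAGER_INDEX=$1"
}

# Example: print the command for an alias named "govuk" (an assumed alias).
reindex_command govuk
```

Printing first, instead of running straight away, gives you a chance to check the alias before the index is locked.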

To monitor progress, SSH to an Elasticsearch box with port-forwarding:

ssh -L9200:localhost:9200 rummager-elasticsearch-1.api.staging

Then visit http://localhost:9200/_plugin/head/ to check how many documents have been copied to the new index.
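If you prefer the command line to the head plugin, a minimal sketch using Elasticsearch's _cat/indices API could look like this. It assumes the SSH tunnel above is open and that index names start with their alias (as in govuk-2018-...). The helper names and the ES_URL variable are hypothetical.

```shell
# Base URL of the port-forwarded Elasticsearch (assumes the SSH tunnel is open).
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: URL that lists every index whose name starts with the
# given alias. The "v" parameter adds a header row, which includes docs.count.
indices_url() {
  printf '%s\n' "${ES_URL}/_cat/indices/${1}*?v"
}

# Hypothetical helper: fetch the listing once.
show_doc_counts() {
  curl -s "$(indices_url "$1")"
}

# Usage (needs the SSH tunnel):
#   show_doc_counts govuk
```

Comparing the docs.count column for the old and new index gives a rough idea of how far the copy has progressed.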

Replay traffic

This step is only necessary if you ran the reindexing job during working hours, which means that content published in whitehall during the reindex will be missing from search.

See Replaying traffic to correct an out of sync search index for details.

Cleanup

Reindexing does not delete the old index. This lets us switch back to the old index if there is a serious problem with the new one.

Once you’re confident that the reindexing was successful, delete the old (unaliased) index using the rummager rake task:

rake rummager:clean RUMMAGER_INDEX=alias_of_index_to_clean_up

Avoid leaving old indices around for more than a few days. Rummager performance starts to degrade once there are more than three or four old indices in the cluster.

Troubleshooting

To stop the reindexing job

If you need to cancel the reindexing while it’s in progress:

  1. Stop the reindexing rake task
  2. Unlock the old index by running the rummager rake task: rake rummager:unlock RUMMAGER_INDEX=alias_of_index_to_unlock

This doesn’t actually stop the reindexing, because reindexing is an internal Elasticsearch process triggered by the rake task. It does stop the rake task from switching the alias over to the new index once all the data has been copied, which is normally good enough.

If you need to stop the reindexing process itself, for example because Elasticsearch is about to run out of disk space, port-forward to the rummager-elasticsearch box (see above) then use http://localhost:9200/_plugin/head/ to send these requests to Elasticsearch:

  1. Find the ID of the reindexing task:

    GET /_tasks?actions=%2Areindex
    
  2. Stop the task:

    POST /_tasks/{task_id}/_cancel
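The same two requests can also be sent with curl instead of the head plugin. This is a sketch only: it assumes the SSH tunnel to the Elasticsearch box is open, and the helper names and the ES_URL variable are hypothetical. The task ID is whatever the first request returns.

```shell
# Base URL of the port-forwarded Elasticsearch (assumes the SSH tunnel is open).
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: URL that lists running tasks whose action name ends
# in "reindex" (the same request as GET /_tasks?actions=%2Areindex).
list_reindex_tasks_url() {
  printf '%s\n' "${ES_URL}/_tasks?actions=*reindex"
}

# Hypothetical helper: URL that cancels one task. $1 is the task ID taken
# from the listing.
cancel_task_url() {
  printf '%s\n' "${ES_URL}/_tasks/${1}/_cancel"
}

# Usage (needs the SSH tunnel; replace TASK_ID with the ID from the listing):
#   curl -s "$(list_reindex_tasks_url)"
#   curl -s -XPOST "$(cancel_task_url TASK_ID)"
```

Keeping the URLs in small functions makes it easy to print and check them before sending the cancel request.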
    

To switch back to the old index

If you discover a problem after reindexing and need to switch back to the old index, run this rummager rake task:

rake rummager:switch_to_named_index[full_index_name] RUMMAGER_INDEX=index_alias

where full_index_name is the full name of the index you want to switch to (in this case the old index), including the date and UUID, e.g. govuk-2018-01-29t17:08:21z-31f39bdb-c62b-4607-8081-19ea87fb1498.

Switching back to an old index means that you’ll lose any content updates that were published while the new index was live. To fix this:

  1. Replay traffic from whitehall
  2. Republish other content using the publishing-api rake task:

    rake 'represent_downstream:published_between[2018-01-04T09:30:00, 2018-01-04T10:00:00]'
    