Warning: This document has not been updated for a while. It may be out of date.
Last updated: 1 Sep 2023

search-api: Adding new fields to a document type

The schema

config/schema contains a set of JSON files that together define the schema for documents in Search API. This is described in more detail in the README.

First, decide which field type to use. field_types.json defines common Elasticsearch configuration that is reused by every field of the same type.

The type you choose determines whether Elasticsearch analyses the field and whether you can use it in filters and aggregates.

Add your new field to field_definitions.json.

If your field should be valid for any kind of document, you can add it to base_elasticsearch_type.json. Otherwise, add it to the appropriate JSON file under elasticsearch_types.
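
To make the relationship between the three kinds of file concrete, here is a minimal Ruby sketch of how a field name might resolve to Elasticsearch configuration. The field name example_topic, the identifier type, and the key names (type, es_config, fields) are assumptions made for illustration. Check the real files under config/schema and the README for the actual structure.

```ruby
require "json"

# Hypothetical excerpt of field_types.json: reusable Elasticsearch config per type.
FIELD_TYPES = JSON.parse(<<~JSON)
  {
    "identifier": {
      "es_config": { "type": "keyword" }
    }
  }
JSON

# Hypothetical excerpt of field_definitions.json: each field names its type.
FIELD_DEFINITIONS = JSON.parse(<<~JSON)
  {
    "example_topic": {
      "description": "An illustrative field, not part of the real schema",
      "type": "identifier"
    }
  }
JSON

# Hypothetical document type file under elasticsearch_types: the fields it allows.
DOCUMENT_TYPE = JSON.parse(<<~JSON)
  { "fields": ["example_topic"] }
JSON

# Look up each allowed field's type, then that type's Elasticsearch config.
# Hash#fetch raises KeyError if a field or type has not been defined.
def es_mapping_for(document_type, definitions, types)
  document_type.fetch("fields").to_h do |name|
    type_name = definitions.fetch(name).fetch("type")
    [name, types.fetch(type_name).fetch("es_config")]
  end
end

puts JSON.pretty_generate(es_mapping_for(DOCUMENT_TYPE, FIELD_DEFINITIONS, FIELD_TYPES))
```

In this sketch a field listed for a document type but missing from the definitions fails loudly, which is the kind of mistake an integration test should also catch.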

Integration testing

The easiest way to test the new fields is to write an integration test for them. These tests run against a development Elasticsearch cluster and create new search indices on each test run.

Transformation during indexing

Some fields get transformed by Search API before they are stored in Elasticsearch. This is handled by the DocumentPreparer class.

Some fields get expanded by Search API when they are presented in search results. For example, specialist_sector links get expanded by looking up the corresponding documents from the search index and extracting title, content id, and link fields. This is handled by Search::BaseRegistry.
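
The expansion step can be pictured with a small, self-contained Ruby sketch. It is not the Search::BaseRegistry implementation: the in-memory INDEX hash, the expand_links helper, the specialist_sectors key, and the fallback for links with no matching document are all hypothetical, chosen only to show the lookup-and-extract idea.

```ruby
# Hypothetical in-memory stand-in for the search index, keyed by link.
INDEX = {
  "/topic/example/first" => {
    "title" => "Example first topic",
    "content_id" => "00000000-0000-0000-0000-000000000001",
    "link" => "/topic/example/first",
    "description" => "Not copied into the expanded result",
  },
}.freeze

# The fields extracted from each looked-up document.
EXPANDED_FIELDS = %w[title content_id link].freeze

# Replace each link with the selected fields of the document it points to.
# Assumption: a link with no matching document is kept as a bare link.
def expand_links(links, index)
  links.map do |link|
    document = index[link]
    document ? document.slice(*EXPANDED_FIELDS) : { "link" => link }
  end
end

result = { "specialist_sectors" => ["/topic/example/first", "/topic/missing"] }
result["specialist_sectors"] = expand_links(result["specialist_sectors"], INDEX)
p result
```

If a new field holds links to other documents, it is worth checking whether it needs similar expansion before it appears in results.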

Updating Search API schema indexes on all environments

Caution: Do not run this rake task in production during working hours except in an emergency. Content published while the task is running will not be available in search results until the task completes. The impact of this can be reduced if you run the task out of peak publishing hours.

For the new field to work as expected, you need to run a Jenkins job on every environment. The job is "Search reindex with new schema". It runs the search:migrate_schema rake task and can take over 2 hours to complete.

This job will block other rake tasks from being run for 15 minutes to an hour.

Read more about re-indexing the Elasticsearch indexes here.

If you consider the change low risk and are only adding new fields for which content doesn't yet exist, you can run the search:update_schema task. This task will attempt to update the Elasticsearch index schema in place without requiring a re-index. If you have made any changes which affect existing fields, Elasticsearch will reject the change and a full re-index will be required.
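
The choice between the two tasks can be sketched as a comparison of the current and proposed field mappings. This is an illustrative Ruby sketch, not code from Search API: suggested_task and the sample mappings are hypothetical, and treating a removed field the same as a changed one is a conservative assumption. It also cannot judge whether content already exists for a new field, which you still need to check yourself.

```ruby
# Suggest a rake task by comparing field mappings (field name => config).
# Any existing field whose config changes or disappears counts as affected,
# in which case an in-place update is assumed not to be possible.
def suggested_task(current_mapping, new_mapping)
  affected = current_mapping.reject { |name, config| new_mapping[name] == config }.keys
  affected.empty? ? "search:update_schema" : "search:migrate_schema"
end

current = { "title" => { "type" => "text" } }

# Only adds a new field: an in-place update may be enough.
added = current.merge("example_topic" => { "type" => "keyword" })
puts suggested_task(current, added)

# Changes an existing field: a full re-index is needed.
changed = { "title" => { "type" => "keyword" } }
puts suggested_task(current, changed)
```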

Troubleshooting

The new field doesn't show up

For the new Elasticsearch configuration to take effect, you need to manually rebuild the search indexes.

In the past, the search_fetch_analytics Jenkins job did this automatically every night, but that automation was reverted, so you must now run the rebuild manually.

If you prefer running a rake task directly rather than the pre-written Jenkins job, run search:migrate_schema with SEARCH_INDEX=all.