Add new fields or document types to search
The config/schema directory contains a set of JSON files that together define the schema for documents in rummager. The README describes this in more detail.
First you need to decide which field type to use.
field_types.json defines common Elasticsearch configuration that is reused by every field of the same type.
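As a rough illustration of that reuse, the sketch below resolves a field to the Elasticsearch configuration of its type. The JSON contents, the `es_config` key, the `FIELDS` constant and the `es_config_for` helper are all assumptions made for this example; check the real files in config/schema for the actual layout.

```ruby
require "json"

# Hypothetical stand-in for field_types.json. The real keys and mapping
# options may differ.
FIELD_TYPES = JSON.parse(<<~JSON)
  {
    "identifier": { "es_config": { "type": "string", "index": "not_analyzed" } },
    "date": { "es_config": { "type": "date" } }
  }
JSON

# Hypothetical field definitions, each naming one of the types above.
FIELDS = JSON.parse(<<~JSON)
  {
    "content_id": { "type": "identifier" },
    "slug": { "type": "identifier" },
    "public_timestamp": { "type": "date" }
  }
JSON

# Look up the shared Elasticsearch configuration for a named field.
# Raises KeyError if the field or its type is not defined.
def es_config_for(field_name)
  type_name = FIELDS.fetch(field_name).fetch("type")
  FIELD_TYPES.fetch(type_name).fetch("es_config")
end

puts es_config_for("content_id").inspect
puts es_config_for("public_timestamp").inspect
```

In this sketch, two fields that name the same type receive identical configuration, so a change to the type applies to both.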
Add your new field to
If your field should be valid for any kind of document, you can add it to
base_elasticsearch_type.json. Otherwise, add it to the appropriate JSON file under
The easiest way to test new fields is to write an integration test for them. These tests run against a development Elasticsearch cluster and create new search indices on each test run.
Transformation during indexing
Some fields get transformed by rummager before they are stored in Elasticsearch. This is handled by the
Presenting for search
Some fields get expanded by rummager when they are presented in search results. For example,
specialist_sector links get expanded by looking up the corresponding documents from the search index and extracting title, content id, and link fields. This is handled by
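The expansion step can be pictured with the minimal sketch below. It is not rummager's implementation: the in-memory `INDEX` hash stands in for the search index, and the sample document, the `expand_links` helper and the fallback for unknown links are assumptions made for illustration.

```ruby
# Hypothetical in-memory stand-in for the search index, keyed by link.
INDEX = {
  "/topic/example-sector" => {
    "title" => "Example sector",
    "content_id" => "hypothetical-content-id",
    "link" => "/topic/example-sector",
    "description" => "This field is not copied into the expanded result",
  },
}.freeze

# The fields extracted from each linked document.
EXPANDED_KEYS = %w[title content_id link].freeze

# Replace each link with the title, content id and link of the document it
# points to. Links with no matching document are kept as a bare link hash
# (an assumption for this sketch).
def expand_links(links, index)
  links.map do |link|
    document = index[link]
    document ? document.slice(*EXPANDED_KEYS) : { "link" => link }
  end
end

puts expand_links(["/topic/example-sector", "/topic/unknown"], INDEX).inspect
```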
Updating Rummager schema indexes on all environments
Caution: Do not run this rake task in production during working hours except in an emergency. Content published while the task is running will not be available in search results until the task completes. You can reduce the impact by running the task outside peak publishing hours.
For the new field to work as expected, you need to run a Jenkins job on all environments. The job is “Search reindex with new schema” (Link to integration version of task). It runs the rummager:migrate_schema rake task and can take over 40 minutes to complete.
While it runs, the job blocks other rake tasks from being run for between 15 minutes and an hour.
The new field doesn’t show up
For the new Elasticsearch configuration to take effect, you need to manually rebuild the search indexes.
If you would rather run the rake task directly than use the pre-written Jenkins job, run rummager:migrate_schema with these environment variables set:
RUMMAGER_INDEX=all SKIP_LINKS_INDEXING_TO_PREVENT_TIMEOUTS=1