Last updated: 13 Jan 2026

search-api: Indexing

Elasticsearch, the search engine operated by Search API, stores documents in indexes.

This document describes how documents are indexed (added to Elasticsearch indexes).

Nomenclature

Link: Either the base path for a content item, or an external link.
Document: An Elasticsearch document, something we can search for.
Document Type: An Elasticsearch document type specifies the fields for a particular type of document. All our document types are defined in config/schema/elasticsearch_types.
Index: An Elasticsearch search index. Search API maintains separate indexes (government and govuk), but searches return documents from all of them.
Index Group: An alias in Elasticsearch that points to one index at a time. This allows us to rebuild indexes without downtime.
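
To illustrate how an alias allows an index to be replaced without downtime, the following is a minimal sketch that builds a request body for Elasticsearch's `_aliases` endpoint, which applies its remove and add actions as a single atomic operation. The helper name `alias_swap_actions` and the dated index names are hypothetical, and this is not taken from Search API's own code.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build a body for POST /_aliases that repoints `alias` in one step."""
    return {
        "actions": [
            # Detach the alias from the index currently serving searches.
            {"remove": {"index": old_index, "alias": alias}},
            # Attach it to the freshly built index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    body = alias_swap_actions("govuk", "govuk-old", "govuk-new")
    print(json.dumps(body, indent=2))
```

Because both actions are sent in one request, searches against the alias see either the old index or the new one, never neither.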

How documents get added to the search indexes

There are two ways documents get added to a search index:

HTTP requests to Search API's Documents API (deprecated).
RabbitMQ messages from the Publishing API, which Search API subscribes to.

Search API search results are weighted by popularity. We rebuild the index nightly to incorporate the latest analytics.

Publishing API integration

Search API subscribes to a RabbitMQ queue of updates from publishing-api. This still requires Sidekiq to be running.

bundle exec rake message_queue:insert_data_into_govuk
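
As a rough illustration of what a queue consumer might do with each message, the sketch below turns a JSON message body into a document keyed by its link. The field names (`base_path`, `title`, `description`) and the helper `document_from_message` are assumptions made for this example. They do not describe Search API's actual message handling.

```python
import json


def document_from_message(raw_message):
    """Return a (link, document) pair, or None if the message has no link."""
    payload = json.loads(raw_message)

    # Assumption: the content item's base path acts as the document's link.
    link = payload.get("base_path")
    if not link:
        return None

    document = {
        "link": link,
        "title": payload.get("title"),
        "description": payload.get("description"),
    }
    return link, document


if __name__ == "__main__":
    message = json.dumps({"base_path": "/example", "title": "Example page"})
    print(document_from_message(message))
```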

There is also a separate process that listens only to 'links' updates from the Publishing API. It updates the old indexes that are populated through the '/documents' API (government), and can be removed once those indexes no longer exist.

bundle exec rake message_queue:listen_to_publishing_queue

Internal only APIs

There are some other APIs that are only exposed internally:

content-api.md for the /content/* endpoint.
documents.md for the */documents/ endpoint.

These are used by search admin.

Schemas

See schemas for more detail.

Changing the schema/Reindexing

After changing the schema, you'll need to recreate the index. This reindexes documents from the existing index.

SEARCH_INDEX=all bundle exec rake search:migrate_schema

Representing parts and attachments

Parts are subpages within a single GOV.UK content item, each with its own title, body, and slug. They allow one piece of content to be split into multiple sections (e.g., /parent/section-name) without creating separate content items.

Search API indexes both parts and HTML attachments using the same parts field, treating them as additional sections of the main document. Rather than being handled like file downloads, HTML attachments are stored as extra “parts”. This keeps everything in one indexed document while still allowing each attachment's title and body to be searchable.
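
The shape of the resulting field can be sketched as follows. This is a minimal example under stated assumptions: the input keys for attachments (`url`, `title`, `content`), the rule for deriving an attachment's slug from its URL, and the helper `build_parts` are all hypothetical and are not taken from Search API's code.

```python
def build_parts(parts, html_attachments):
    """Merge content parts and HTML attachments into a single `parts` list."""
    combined = []

    for part in parts:
        combined.append(
            {"slug": part["slug"], "title": part["title"], "body": part["body"]}
        )

    for attachment in html_attachments:
        # Assumption: use the last path segment of the attachment URL as its slug.
        slug = attachment["url"].rstrip("/").split("/")[-1]
        combined.append(
            {"slug": slug, "title": attachment["title"], "body": attachment["content"]}
        )

    return combined


if __name__ == "__main__":
    parts = [{"slug": "overview", "title": "Overview", "body": "Intro text"}]
    attachments = [
        {"url": "/parent/annex-a", "title": "Annex A", "content": "Annex text"}
    ]
    for entry in build_parts(parts, attachments):
        print(entry)
```

Each entry in the combined list carries its own title and body, so an attachment can match a search query in the same way as any other section of the parent document.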