Last updated: 28 Mar 2025

GOV.UK's sitemap

GOV.UK’s sitemap is available at https://www.gov.uk/sitemap.xml. GOV.UK has far too many pages to fit into one sitemap file (the sitemap protocol allows at most 50,000 URLs per file), so this file is really a ‘sitemap index’, which references around 30 other XML files, such as https://www.gov.uk/sitemaps/sitemap_1.xml.
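As a rough illustration of what a sitemap index looks like to a consumer, the sketch below parses one and lists the child sitemaps it references. The sample XML is illustrative only: the `sitemap_2.xml` entry and the `child_sitemap_urls` helper are assumptions made for the example, and the real index lists around 30 files.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Illustrative sample, not a copy of the live file.
# sitemap_2.xml is an assumed name used only for this example.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.gov.uk/sitemaps/sitemap_1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://www.gov.uk/sitemaps/sitemap_2.xml</loc>
  </sitemap>
</sitemapindex>
"""


def child_sitemap_urls(index_xml: str) -> list[str]:
    """Return the <loc> of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml.encode("utf-8"))
    locs = root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locs if loc.text]


if __name__ == "__main__":
    for url in child_sitemap_urls(SAMPLE_INDEX):
        print(url)
```

Each URL returned points at an ordinary sitemap file, which in turn lists individual page URLs.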

How the sitemap is generated

Every morning, a search-api-generate-sitemap cronjob runs to generate a fresh sitemap.

The cronjob runs the sitemap:generate_and_upload rake task in search-api. This iterates over all documents in Search API and generates sitemap files matching the format specified in https://www.sitemaps.org/protocol.html. The same job also creates the sitemap index.
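The general shape of this kind of job can be sketched as follows. This is not the search-api implementation (which is a Ruby rake task); it is a minimal Python sketch of the technique, assuming the document URLs are already available as an iterable. The function names and the `sitemap_N.xml` naming under `/sitemaps/` are assumptions for the example, and the upload step is left out.

```python
import xml.etree.ElementTree as ET
from itertools import islice

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def build_sitemap(urls):
    """Serialise one <urlset> document containing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_sitemaps_and_index(urls, base_url, max_urls=MAX_URLS_PER_SITEMAP):
    """Split URLs into sitemap files and build the index that lists them.

    Returns (files, index_xml), where files maps file name to XML content.
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_XMLNS)
    for number, chunk in enumerate(chunked(urls, max_urls), start=1):
        name = f"sitemap_{number}.xml"  # assumed naming scheme
        files[name] = build_sitemap(chunk)
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemaps/{name}"
    return files, ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    pages = [f"https://www.gov.uk/example-page-{i}" for i in range(5)]
    files, index_xml = build_sitemaps_and_index(
        pages, "https://www.gov.uk", max_urls=2
    )
    print(sorted(files))
    print(index_xml)
```

Splitting by a fixed chunk size keeps every file within the protocol limit, and the index is built in the same pass so it always matches the files that were written.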

The sitemap generator is configured to search for documents across all of Search API’s indexes.

Indexes

search-api-v2 has no concept of an ‘index’. search-api, on the other hand…

Documents are spread across three ‘indexes’ in Search API:

  • govuk: the index populated by Publishing API, intended to encapsulate all GOV.UK content
  • government and detailed: the remaining legacy ‘content indexes’, encapsulating some Whitehall content and Detailed Guides respectively.

There are two Search API ADRs documenting the decision to move to one govuk index: ADR-04 and ADR-06. Some legacy indexes (e.g. mainstream) have been fully migrated into it, but the two legacy indexes listed above remain.

You can find out which index a piece of content is stored in by querying Search API: see "index": "government" in this example.
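A lookup like this could be scripted along the following lines. This is a sketch under assumptions: only the "index" field comes from the text above, while the "results" wrapper, the "link" field, the sample path and the `index_for_link` helper are hypothetical and used only to illustrate reading the value from a JSON response.

```python
import json

# Illustrative response body. Only the "index" field is confirmed by the
# documentation; the "results" wrapper, the "link" field and the paths are
# assumptions made for this example.
SAMPLE_RESPONSE = """
{
  "results": [
    {"link": "/government/example-page", "index": "government"},
    {"link": "/example-guide", "index": "govuk"}
  ]
}
"""


def index_for_link(response_body, link):
    """Return the index name recorded for `link`, or None if it is absent."""
    payload = json.loads(response_body)
    for result in payload.get("results", []):
        if result.get("link") == link:
            return result.get("index")
    return None


if __name__ == "__main__":
    print(index_for_link(SAMPLE_RESPONSE, "/government/example-page"))
```

Returning None for an unknown link keeps the caller responsible for deciding whether a missing document is an error.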