
Search API field reference

Documents in an Elasticsearch index have a type, and each type can have different fields.

In Search API, the field that holds this type is named `elasticsearch_type` to avoid confusion with `content_store_document_type`.

All fields

all_searchable_text
all_searchable_text_type
Special field which searchable text is copied into; similar to the standard `_all` field, but more customisable
exact_query
best_bet_exact_match_text
Field used in the best-bet implementation to perform an exact match between a user’s query and a stored best-bet
stemmed_query
best_bet_stemmed_match_text
Field used in the best-bet implementation to perform a stemmed match between a user’s query and a stored best-bet
has_act_paper
boolean
Used for official document status filter on publication page (publication formats only).
has_command_paper
boolean
Used for official document status filter on publication page (publication formats only).
has_official_document
boolean
Used for official document status filter on publication page (publication formats only).
important_to_policy
boolean
A flag set by editors to mark a document as being important to policy
is_historic
boolean
If the content is political and was published by a previous government, it is considered historic and not reflective of the current government
is_political
boolean
If the content is considered political in nature, reflecting views of the government it was published under
is_withdrawn
boolean
If the content has been published but then withdrawn
relevant_to_local_government
boolean
No longer used. Will be removed in a future version.
alert_issue_date
date

assessment_date
date
The date when the assessment was held.
build_end_date
date

build_start_date
date

closed_date
date
Close date of a CMA case
closing_date
date

date_of_occurrence
date
Date of the event described in the document. Only applies to MAIB, RAIB, AAIB reports.
end_date
date
End date for topical content. For topical event pages, treat a null value as meaning the end date is in the past.
first_published_at
date
The date the content was first published. Should be the same as `first_published_at` in the publishing-api.
issued_date
date

opened_date
date
Open date of a CMA case
public_timestamp
date
Time of the last update. Used for 1) weighting more recent editions (mainly whitehall content) more highly, 2) the default sort in finders, 3) showing a date in the search results. Should be the same as `public_updated_at` in the publishing-api.
release_timestamp
date
When statistics will be released, for statistics announcement pages.
start_date
date
Start date for topical content.
tribunal_decision_decision_date
date

updated_at
date
When the page was last updated. This field is unreliable and may be deprecated in a future version.
popularity
float
Popularity indicator used to bias search results. A higher number indicates more pageviews. Updated nightly.
rank_14
float
Field used in the page-traffic index to hold the rank of this page in the list of pages on the website when ordered by traffic. This is a fairly stable value used in ranking calculations.
content_id
identifier
The content_id of the item. This will not be present for all items, as most applications do not send it.
content_store_document_type
identifier
The document type as stored in the Content Store
detailed_format
identifier
A slugified version of the display_type field
dfid_review_status
identifier
Whether or not this item is peer reviewed (a code)
display_type
identifier
Different way of describing content_store_document_type, for some formats. May be deprecated in a future version.
email_document_supertype
identifier
High-level group for email subscriptions, used to identify publications and announcements. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
format
identifier
This field is less specific than content_store_document_type but is mandatory for every document. May be deprecated in future.
government_document_supertype
identifier
Grouping for email subscriptions. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
hmrc_manual_section_id
identifier

id
identifier
This field will be deprecated in a future version. Do not use.
licence_identifier
identifier
The licence ID associated with a content item of type licence
link
identifier
The link to the document. This is usually the path component of the URL of the document, including a leading slash, but sometimes omits the leading slash, and is sometimes an absolute URL of related content which does not appear on GOV.UK
manual
identifier
Base path of the manual this document belongs to, e.g. `/service-manual` or `/hmrc-internal-manuals/air-passenger-duty`
navigation_document_supertype
identifier
Navigation document type. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
operational_field
identifier
The place a fatality notice applies to.
organisation_state
identifier
Status of an organisation page on GOV.UK: {live, joining, exempt, transitioning, closed, devolved}
organisation_type
identifier
The organisation type identifier, like `ministerial_department` or `public_corporation`. Only applies to organisations. Expected value is the `organisation_type_key` from Whitehall, enumerated in https://github.com/alphagov/whitehall/blob/master/app/models/organisation_type.rb.
outcome_type
identifier
Outcome of a CMA case.
publishing_app
identifier
Application that published this page
rendering_app
identifier
Application that renders this page
search_user_need_document_supertype
identifier
Grouping for core or government documents. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
slug
identifier
The trailing part of the link. Slugs must be unique within a format.
statistics_announcement_state
identifier
State of a statistical announcement page; one of {cancelled, confirmed, provisional}
stemmed_query_as_term
identifier
Field used in the best-bet implementation to hold the raw form of a stemmed query
tribunal_decision_category
identifier

tribunal_decision_country
identifier

tribunal_decision_landmark
identifier

tribunal_decision_reference_number
identifier

tribunal_decision_sub_category
identifier

user_journey_document_supertype
identifier
Used to distinguish pages used mainly for navigation (finding) from content pages (thing). See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
aircraft_category
identifiers
Used for official document status filter on publication page (publication formats only).
alert_type
identifiers

case_state
identifiers
Whether a case is open or closed. Applies to CMA cases.
case_type
identifiers
Similar to report_type: applies to CMA cases.
contact_groups
identifiers
Contact groups a 'contact' page belongs to.
country
identifiers
Country associated with the page. Does not link to a page on GOV.UK. See also `world_locations`.
development_sector
identifiers

document_collections
identifiers
Document collections this page belongs to (https://docs.publishing.service.gov.uk/document-types/document_collection.html).
document_series
identifiers

eligible_entities
identifiers

fault_type
identifiers

faulty_item_model
identifiers

faulty_item_type
identifiers

fund_state
identifiers

fund_type
identifiers

funding_amount
identifiers
Funding (per unit per year)
funding_source
identifiers

grant_type
identifiers
Countryside stewardship grant type (https://www.gov.uk/countryside-stewardship-grants).
land_use
identifiers
Countryside stewardship grant land use type (https://www.gov.uk/countryside-stewardship-grants).
location
identifiers
Location metadata. Not comparable between different formats.
mainstream_browse_page_content_ids
identifiers
As opposed to “mainstream_browse_pages”, this will include the “content_ids” of each mainstream browse page rather than the slug. It will eventually replace “mainstream_browse_pages”.
mainstream_browse_pages
identifiers
Mainstream browse pages the page can appear in (https://www.gov.uk/browse).
manufacturer
identifiers

market_sector
identifiers
Market sector a CMA case relates to.
medical_specialism
identifiers

organisation_content_ids
identifiers
As opposed to “organisations”, this will include the “content_ids” of each organisation rather than the slug. It will eventually replace “organisations”.
organisations
identifiers
The organisations related to this page. This field is copied from the publishing-api. Note that this means different things for different formats.
part_of_taxonomy_tree
identifiers
Any taxon tagged to the document and any of their ancestor taxons
path_components
identifiers
Field used in the page-traffic index to hold the full paths of each component of a URL. For example, a document with a path of `/foo/bar/baz` would have the values `/foo`, `/foo/bar` and `/foo/bar/baz` in this field. This allows all the pages under a given URL component to be identified.
people
identifiers
Links to people associated with this page (https://www.gov.uk/government/people).
policies
identifiers
Policy content related to the page (https://www.gov.uk/government/policies).
policy_areas
identifiers
Policy areas are managed in Whitehall. They’re an old grouping of policies, which we’re expecting to deprecate soon. Formerly known as 'topics'.
policy_groups
identifiers
Links to policy groups (working groups) associated with this page (https://www.gov.uk/government/groups).
primary_publishing_organisation
identifiers
The organisation that published this page. This field is copied from the publishing-api. It is only populated by Whitehall.
railway_type
identifiers

report_type
identifiers
Report type for MAIB, RAIB, AAIB reports. Possible values vary by format.
search_format_types
identifiers
Filters used by publications/announcement pages (https://www.gov.uk/government/publications/). May be deprecated in a future version.
serial_number
identifiers

specialist_sectors
identifiers
The navigation “topics” that the document is assigned to. Nothing to do with “policy areas”
taxons
identifiers
Topics associated with the page. Used for topic pages (e.g. https://www.gov.uk/education).
therapeutic_area
identifiers
Therapeutic area (https://www.gov.uk/drug-safety-update).
tiers_or_standalone_items
identifiers
Countryside stewardship grant land tiers or standalone items (https://www.gov.uk/countryside-stewardship-grants).
topic_content_ids
identifiers
The navigation “topics” that the document is assigned to. Nothing to do with “policy areas”. As opposed to “specialist_sectors”, this will include the “content_ids” of each topic rather than the slug. It will eventually replace “specialist_sectors”.
tribunal_decision_categories
identifiers

tribunal_decision_judges
identifiers

tribunal_decision_sub_categories
identifiers

value_of_funding
identifiers

vessel_type
identifiers

world_locations
identifiers
World location associated with this page (https://www.gov.uk/world).
attachments
objects
Metadata associated with any attachments linked to this page.
metadata
opaque_object
The “metadata” field is intended for the storage of additional non-searchable document data. This allows additional information to be stored and displayed in search results without having to make changes to the schema.
dfid_document_type
searchable_identifier
The document type of the output (e.g. Book Chapter, Conference Paper)
tribunal_decision_category_name
searchable_identifier

tribunal_decision_country_name
searchable_identifier

tribunal_decision_landmark_name
searchable_identifier

tribunal_decision_sub_category_name
searchable_identifier

business_sizes
searchable_identifiers
The sizes of business served by a business finance support scheme
business_stages
searchable_identifiers
The stages of business served by a business finance support scheme
dfid_authors
searchable_identifiers
A set of author names, which aren’t selected from a predefined list and don’t repeat
dfid_theme
searchable_identifiers
The broad theme (or themes) to which the output applies
industries
searchable_identifiers
The industries served by a business finance support scheme
tribunal_decision_categories_name
searchable_identifiers

tribunal_decision_judges_name
searchable_identifiers

tribunal_decision_sub_categories_name
searchable_identifiers

types_of_support
searchable_identifiers
The types of support provided by a business finance support scheme
title
searchable_sortable_text
The page title, as displayed in internal search results. May differ from the page title tag or top-level heading.
acronym
searchable_text
Acronym associated with the page title. Used for organisation pages.
aircraft_type
searchable_text
Metadata associated with an AAIB report (https://www.gov.uk/aaib-reports/)
description
searchable_text
A description or summary of the page that can be displayed to users in search results.
indexable_content
searchable_text
The main chunk of text that is indexed for the page. This varies by document type/publishing app, and may not be suitable for users to read. HTML tags are stripped out.
licence_short_description
searchable_text

registration
searchable_text
Metadata associated with an AAIB report (https://www.gov.uk/aaib-reports/)
spelling_text
spelling_text
Generated field, populated with the same content as is sent to the `_all` field, but tokenised into words, lowercased and shingled, not stemmed, etc.
details
unsearchable_text
Field used in the best-bet implementation to store the modifications to be made to the query when a best-bet is matched
government_name
unsearchable_text
The name of the government that first published this document, e.g. '1970 to 1974 Conservative government'
latest_change_note
unsearchable_text
Note indicating what changed in the last major revision.
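
The `path_components` entry above describes storing every cumulative prefix of a page's path. A minimal Python sketch of that expansion follows. It assumes paths are slash-separated and that empty segments are ignored; the function name `path_components` is illustrative only and is not taken from the Search API codebase.

```python
def path_components(path: str) -> list[str]:
    """Return each cumulative prefix of a URL path, shortest first.

    Hypothetical helper: it mirrors the behaviour described for the
    `path_components` field, not the actual Search API implementation.
    """
    # Drop empty segments produced by leading, trailing or doubled slashes.
    segments = [segment for segment in path.split("/") if segment]

    components = []
    current = ""
    for segment in segments:
        current += "/" + segment
        components.append(current)
    return components


print(path_components("/foo/bar/baz"))
# prints ['/foo', '/foo/bar', '/foo/bar/baz']
```

Storing the prefixes this way means that finding every page under `/foo/bar` becomes an exact match on a single value rather than a prefix query.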