Search API field reference

Documents in an Elasticsearch index have a type, and each type can have different fields.

In Search API, the field that holds this type is named `elasticsearch_type`, to avoid confusion with `content_store_document_type`.

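All fields

Each entry lists the field name, then its type, then a description (left blank where none is documented).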

all_searchable_text
all_searchable_text_type
Special field which searchable text is copied into; similar to the standard `_all` field, but more customisable
exact_query
best_bet_exact_match_text
Field used in the best-bet implementation to perform an exact match between a user’s query and a stored best-bet
stemmed_query
best_bet_stemmed_match_text
Field used in the best-bet implementation to perform a stemmed match between a user’s query and a stored best-bet
has_act_paper
boolean
Used for official document status filter on publication page (publication formats only).
has_command_paper
boolean
Used for official document status filter on publication page (publication formats only).
has_official_document
boolean
Used for official document status filter on publication page (publication formats only).
important_to_policy
boolean
A flag set by editors to mark a document as being important to policy
is_historic
boolean
If the content is political and was published by a previous government, it is considered historic and not reflective of the current government
is_political
boolean
If the content is considered political in nature, reflecting views of the government it was published under
is_withdrawn
boolean
If the content has been published but then withdrawn
relevant_to_local_government
boolean
No longer used. Will be removed in a future version.
assessment_date
date
The date when the assessment was held.
closed_at
date
Closing date for closed organisations.
closed_date
date
Close date of a CMA case
closing_date
date

date_of_occurrence
date
Date of the event described in the document. Only applies to MAIB, RAIB, AAIB reports.
end_date
date
End date for topical content. For topical event pages, treat a null value as meaning the end date is in the past.
first_published_at
date
The date the content was first published. Should be the same as `first_published_at` in the publishing-api.
issued_date
date

laid_date
date
The date on which a statutory instrument was laid before parliament
opened_date
date
Open date of a CMA case
public_timestamp
date
Time of the last update. Used for 1) weighting more recent editions (mainly whitehall content) more highly, 2) the default sort in finders, 3) showing a date in the search results. Should be the same as `public_updated_at` in the publishing-api.
release_timestamp
date
When statistics will be released, for statistics announcement pages.
sift_end_date
date
The date on which sifting of a statutory instrument ends
start_date
date
Start date for topical content.
tribunal_decision_decision_date
date

updated_at
date
When the page was last updated. This field is unreliable and may be deprecated in a future version.
withdrawn_date
date
The date on which a statutory instrument was withdrawn
popularity
float
Popularity indicator used to bias search results. A higher number indicates more pageviews. Updated nightly.
rank_14
float
Field used in the page-traffic index to hold the rank of this page in the list of pages on the website when ordered by traffic. This is a fairly stable value used in ranking calculations.
analytics_identifier
identifier
A unique identifier used for analytics.
content_id
identifier
The content_id of the item. This will not be present for all items, as most applications do not send it.
content_purpose_document_supertype
identifier
Grouping for content performance supertypes. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
content_purpose_subgroup
identifier
Grouping for content purpose subgroups. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
content_purpose_supergroup
identifier
Grouping for content purpose supergroups. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
content_store_document_type
identifier
The document type as stored in the Content Store
detailed_format
identifier
A slugified version of the display_type field
dfid_review_status
identifier
Whether or not this item is peer reviewed (a code)
display_type
identifier
Different way of describing content_store_document_type, for some formats. May be deprecated in a future version.
email_document_supertype
identifier
High-level group for email subscriptions, used to identify publications and announcements. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
format
identifier
This field is less specific than content_store_document_type but is mandatory for every document. May be deprecated in future.
government_document_supertype
identifier
Grouping for email subscriptions. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
hmrc_manual_section_id
identifier

id
identifier
This field will be deprecated in a future version. Do not use.
image_url
identifier
The URL of the image associated with an edition
licence_identifier
identifier
The licence ID associated with a content item of type licence
link
identifier
The link to the document. This is usually the path component of the URL of the document, including a leading slash, but sometimes omits the leading slash, and is sometimes an absolute URL of related content which does not appear on GOV.UK
logo_formatted_title
identifier
The title (with line breaks) to be displayed alongside the organisation logo
logo_url
identifier
The URL of the custom logo. Only applies to organisations with a custom logo.
manual
identifier
Base path of the manual this document belongs to, e.g. `/service-manual` or `/hmrc-internal-manuals/air-passenger-duty`
navigation_document_supertype
identifier
Navigation document type. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
operational_field
identifier
The place a fatality notice applies to.
organisation_brand
identifier
The branding (controls the colour) of the organisation logo
organisation_closed_state
identifier
Status of a closed organisation page on GOV.UK: {no_longer_exists, replaced, split, merged, changed_name, left_gov, devolved}
organisation_crest
identifier
The class name of the crest to display when rendering the organisation logo
organisation_state
identifier
Status of an organisation page on GOV.UK: {live, joining, exempt, transitioning, closed, devolved}
organisation_type
identifier
The organisation type identifier, like 'ministerial_department' or 'public_corporation'. Only applies to organisations. Expected value is the `organisation_type_key` from Whitehall, enumerated in https://github.com/alphagov/whitehall/blob/master/app/models/organisation_type.rb.
outcome_type
identifier
Outcome of a CMA case.
publishing_app
identifier
Application that published this page
rendering_app
identifier
Application that renders this page
search_user_need_document_supertype
identifier
Grouping for core or government documents. See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
slug
identifier
The trailing part of the link. Slugs must be unique within a format.
statistics_announcement_state
identifier
State of a statistical announcement page; one of {cancelled, confirmed, provisional}
stemmed_query_as_term
identifier
Field used in the best-bet implementation to hold the raw form of a stemmed query
tribunal_decision_category
identifier

tribunal_decision_country
identifier

tribunal_decision_landmark
identifier

tribunal_decision_reference_number
identifier

tribunal_decision_sub_category
identifier

user_journey_document_supertype
identifier
Used to distinguish pages used mainly for navigation (finding) from content pages (thing). See https://github.com/alphagov/govuk_document_types/blob/master/data/supertypes.yml
aircraft_category
identifiers
Used for official document status filter on publication page (publication formats only).
alert_type
identifiers

case_state
identifiers
Whether a case is open or closed. Applies to CMA cases.
case_type
identifiers
Similar to report_type: applies to CMA cases.
child_organisations
identifiers
A list of organisations that are children of the organisation.
contact_groups
identifiers
Contact groups a 'contact' page belongs to.
country
identifiers
Country associated with the page. Does not link to a page on GOV.UK. See also 'world_locations'.
development_sector
identifiers

document_collections
identifiers
Document collections this page belongs to (https://docs.publishing.service.gov.uk/document-types/document_collection.html).
document_series
identifiers

eligible_entities
identifiers

fund_state
identifiers

fund_type
identifiers

funding_amount
identifiers
Funding (per unit per year)
funding_source
identifiers

grant_type
identifiers
Countryside stewardship grant type (https://www.gov.uk/countryside-stewardship-grants).
land_use
identifiers
Countryside stewardship grant land use type (https://www.gov.uk/countryside-stewardship-grants).
location
identifiers
Location metadata. Not comparable between different formats.
mainstream_browse_page_content_ids
identifiers
As opposed to “mainstream_browse_pages”, this will include the “content_ids” of each mainstream browse page rather than the slug. It will eventually replace “mainstream_browse_pages”.
mainstream_browse_pages
identifiers
Mainstream browse pages the page can appear in (https://www.gov.uk/browse).
market_sector
identifiers
Market sector a CMA case relates to.
medical_specialism
identifiers

organisation_content_ids
identifiers
As opposed to “organisations”, this will include the “content_ids” of each organisation rather than the slug. It will eventually replace “organisations”.
organisations
identifiers
The organisations related to this page. This field is copied from the publishing-api. Note that this means different things for different formats.
parent_organisations
identifiers
A list of organisations that are parents of the organisation.
part_of_taxonomy_tree
identifiers
Any taxon tagged to the document and any of their ancestor taxons
path_components
identifiers
Field used in the page-traffic index to hold the full path of each component of a URL. E.g. a document with a path of `/foo/bar/baz` would have the values `/foo`, `/foo/bar` and `/foo/bar/baz` in this field. This allows all the pages under a given URL component to be identified.
people
identifiers
Links to people associated with this page (https://www.gov.uk/government/people).
policies
identifiers
Policy content related to the page (https://www.gov.uk/government/policies).
policy_areas
identifiers
Policy areas are managed in Whitehall. They’re an old grouping of policies, which we’re expecting to deprecate soon. Formerly known as 'topics'.
policy_groups
identifiers
Links to policy groups (working groups) associated with this page (https://www.gov.uk/government/groups).
primary_publishing_organisation
identifiers
The organisation that published this page. This field is copied from the publishing-api. It is only populated by Whitehall.
railway_type
identifiers

report_type
identifiers
Report type for MAIB, RAIB, AAIB reports. Possible values vary by format.
search_format_types
identifiers
Filters used by publications/announcement pages (https://www.gov.uk/government/publications/). May be deprecated in a future version.
sifting_status
identifiers
The open/close/etc state of a statutory instrument
specialist_sectors
identifiers
The navigation “topics” that the document is assigned to. Nothing to do with “policy areas”.
subject
identifiers
The associated top level taxon for statutory instruments
superseded_organisations
identifiers
A list of organisations that are superseded by the organisation.
superseding_organisations
identifiers
A list of organisations that supersede the organisation.
taxons
identifiers
Topics associated with the page. Used for topic pages (e.g. https://www.gov.uk/education).
therapeutic_area
identifiers
Therapeutic area (https://www.gov.uk/drug-safety-update).
tiers_or_standalone_items
identifiers
Countryside stewardship grant land tiers or standalone items (https://www.gov.uk/countryside-stewardship-grants).
topic_content_ids
identifiers
The navigation “topics” that the document is assigned to. Nothing to do with “policy areas”. As opposed to “specialist_sectors”, this will include the “content_ids” of each topic rather than the slug. It will eventually replace “specialist_sectors”.
tribunal_decision_categories
identifiers

tribunal_decision_judges
identifiers

tribunal_decision_sub_categories
identifiers

value_of_funding
identifiers

vessel_type
identifiers

world_locations
identifiers
World location associated with this page (https://www.gov.uk/world).
attachments
objects
Metadata associated with any attachments linked to this page.
metadata
opaque_object
The “metadata” field is intended for the storage of additional non-searchable document data. This allows additional information to be stored and displayed in search results without having to make changes to the schema.
dfid_document_type
searchable_identifier
The document type of the output (e.g. Book Chapter, Conference Paper)
tribunal_decision_category_name
searchable_identifier

tribunal_decision_country_name
searchable_identifier

tribunal_decision_landmark_name
searchable_identifier

tribunal_decision_sub_category_name
searchable_identifier

business_sizes
searchable_identifiers
The sizes of business served by a business finance support scheme
business_stages
searchable_identifiers
The stages of business served by a business finance support scheme
dfid_authors
searchable_identifiers
A set of author names, which aren’t selected from a predefined list and don’t repeat
dfid_theme
searchable_identifiers
The broad theme (or themes) to which the output applies
industries
searchable_identifiers
The industries served by a business finance support scheme
tribunal_decision_categories_name
searchable_identifiers

tribunal_decision_judges_name
searchable_identifiers

tribunal_decision_sub_categories_name
searchable_identifiers

types_of_support
searchable_identifiers
The types of support provided by a business finance support scheme
title
searchable_sortable_text
The page title, as displayed in internal search results. May differ from the page title tag or top level heading.
acronym
searchable_text
Acronym associated with the page title. Used for organisation pages.
aircraft_type
searchable_text
Metadata associated with an AAIB report (https://www.gov.uk/aaib-reports/)
description
searchable_text
A description or summary of the page that can be displayed to users in search results.
indexable_content
searchable_text
The main chunk of text that is indexed for the page. This varies by document type/publishing app, and may not be suitable for users to read. HTML tags are stripped out.
licence_short_description
searchable_text

registration
searchable_text
Metadata associated with an AAIB report (https://www.gov.uk/aaib-reports/)
spelling_text
spelling_text
Generated field, populated with the same content as is sent to the `_all` field, but tokenised into words, lowercased and shingled, and not stemmed.
details
unsearchable_text
Field used in the best-bet implementation to store the modifications to be made to the query when a best-bet is matched
government_name
unsearchable_text
The name of the Government that first published this document, e.g. '1970 to 1974 Conservative government'
latest_change_note
unsearchable_text
Note indicating what changed in the last major revision.
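
Example: deriving path_components values

The `path_components` entry describes storing the full path of each component of a URL, so that every page under a given component can be identified. The sketch below shows one way those values could be derived from a path. It is an illustration only: the function name `path_components` and the handling of empty segments are assumptions, not Search API's actual implementation.

```python
def path_components(path: str) -> list[str]:
    """Return every cumulative prefix of a URL path, shortest first.

    Hypothetical helper for illustration; not taken from Search API.
    """
    # Drop empty segments produced by leading, trailing or doubled slashes.
    segments = [segment for segment in path.split("/") if segment]
    # Rebuild each prefix with a leading slash: /foo, /foo/bar, ...
    return [
        "/" + "/".join(segments[:count])
        for count in range(1, len(segments) + 1)
    ]


if __name__ == "__main__":
    print(path_components("/foo/bar/baz"))
    # ['/foo', '/foo/bar', '/foo/bar/baz']
```

With values stored this way, an exact-match filter on `/foo/bar` would select every page whose path sits under that component, without needing a prefix query.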