Monitoring

Debug underperforming search

Search is one of the more load-sensitive parts of GOV.UK, as it can’t be cached as effectively as more static pages. There are two significant components involved in search: the search-api application, and the AWS-managed Elasticsearch cluster powering it.

Useful metrics to look at are:

  • Request duration from finder-frontend to search-api, on the finder-frontend app dashboard

    If this has increased then there may be a capacity issue with search-api.

  • Request duration from search-api to Elasticsearch and SageMaker, on the search-api app dashboard

    See the “<thing> req count vs latency” graphs:

    • Reranker: if this has increased, queries sorted by relevance (keyword searches) will be slower. This could indicate a performance issue with SageMaker.
    • Search: if this has increased, all queries will be slower. This could indicate a performance issue with Elasticsearch.
    • Spelling suggestion: if this has increased, finder-frontend pages will be slower. Other search-powered pages, like taxon pages, would not be affected. This could indicate a performance issue with Elasticsearch.

  • The machine dashboard for search.

  • The AWS dashboard for Elasticsearch in the AWS console.

    There are many metrics here. If the “Index thread pool” or “Search thread pool” graphs are consistently above the red dashed line, requests are queueing, which suggests a capacity issue. In that case, talk to Reliability Engineering (RE).

  • The AWS dashboard for SageMaker in the AWS console.

    If the CPU utilisation graph is consistently close to 100%, this suggests a capacity issue.
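
The thread pool check described above can also be made directly against the cluster, which may help confirm what the AWS graphs show. The following is a minimal sketch, not part of the documented process. It assumes the cluster is reachable from where the script runs (AWS-managed domains often require VPC access or signed requests, which this sketch does not handle), that `ES_URL` is a placeholder for the real endpoint, and that the indexing pool is named `write` (it is `index` on older Elasticsearch versions). The helper names `fetch_thread_pools` and `queued_pools` are hypothetical.

```python
import json
import urllib.request

# Placeholder endpoint (assumption): replace with the real cluster URL.
ES_URL = "http://localhost:9200"


def fetch_thread_pools(base_url=ES_URL, pools="search,write"):
    """Fetch thread pool stats from the Elasticsearch _cat API as a list of dicts."""
    url = (
        f"{base_url}/_cat/thread_pool/{pools}"
        "?format=json&h=node_name,name,active,queue,rejected"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def queued_pools(rows):
    """Return the rows where requests are queueing or have been rejected.

    The _cat API returns numeric values as strings, so convert before comparing.
    """
    return [
        row for row in rows
        if int(row["queue"]) > 0 or int(row["rejected"]) > 0
    ]


# Example usage (requires network access to the cluster):
#   for row in queued_pools(fetch_thread_pools()):
#       print(row["node_name"], row["name"], row["queue"], row["rejected"])
```

A non-empty result from `queued_pools` on repeated checks points the same way as the graphs sitting above the red dashed line: requests are waiting for a free thread.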

This page was last reviewed on 19 May 2020. It needs to be reviewed again on 19 August 2020 by the page owner #govuk-searchandnav.