How logging works on GOV.UK

Logging Source diagram.

Logit

GOV.UK follows The GDS Way guidance on logging by using the approved vendor, Logit.

For information on how to log in and view stacks, please see the GOV.UK Logit documentation.

Elasticsearch

You can access the credentials for the Elasticsearch instances in Logit using the logit key in the govuk-secrets 2nd Line password store.

Filebeat

Each machine runs Elastic Filebeat and independently ships its logs to the Logit-provided Logstash endpoint.

Filebeat tails logs every 10 seconds and can output to a variety of destinations. It is fully incorporated into the Elastic ecosystem.

We use the filebeat::prospector defined type to create the filebeat configuration on each instance.

Logstream and Logship

We have a defined type in our Puppet code which uses logship to tail logfiles. We only use Logstream to send nginx metrics, via statsd, to Graphite.

In the future this will be replaced.

Kibana

Kibana is the interface for viewing logs in Elasticsearch. Use the Logit interface to log in to Kibana.

There’s some documentation on useful Kibana queries for 2nd line.

Fastly

Fastly sends logs to multiple locations for the www, assets and bouncer services:

  • via syslog to the logs-cdn-1 boxes in all environments (/mnt/logs_cdn), available immediately
  • to an S3 bucket per environment, available every 10 minutes

Graphite

Data from statsd goes to the Graphite instances and is then displayed using Grafana.

Analytics through Athena

This documentation is adapted from the Email Alert API analytics documentation, which also uses Athena.

Athena is Amazon’s service for querying files in S3 using an SQL-like syntax. The logs bucket is set up to be crawled by AWS Glue every day in order to discover new partitions and configure the schema correctly.

Athena is accessible through the AWS control panel by following the instructions for accessing the AWS console. To access the production data you will need to use the govuk-infrastructure-production account. Once there, go to Athena and select the fastly_logs database.

There are 16 fields to query, plus the year, month and date as partitions. It is vastly cheaper and faster to query with partitions.

Athena is queried using a SQL dialect provided by Presto; query documentation is available.

Always query with partitions

You should always query with a WHERE condition that defines the partitions to be used in your result set, for example WHERE year=2018 AND month=7 AND date=4, unless you are sure you need a wider data range.

The data is stored in directories which separate the data by year, month and date values. By applying a partition to the query, such as WHERE year=2018 AND month=7 AND date=4, you reduce the data that the query needs to traverse to just the files from that single day, which makes the query substantially quicker.
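
As a rough illustration of the partition layout described above, the condition for a single day can be derived from a date. This is a minimal sketch in Python, not part of any GOV.UK tooling: the function name partition_clause is hypothetical, and it assumes the partition columns are the integers year, month and date, as in the examples on this page.

```python
from datetime import date


def partition_clause(day: date) -> str:
    """Return a WHERE-clause fragment limiting a query to one day's partition.

    Assumption: integer partition columns named year, month and date,
    matching the example queries on this page.
    """
    return f'"year" = {day.year} AND "month" = {day.month} AND "date" = {day.day}'


if __name__ == "__main__":
    # Hypothetical usage: build the condition for 4 July 2018.
    print("WHERE " + partition_clause(date(2018, 7, 4)))
```

The fragment can then be combined with any further conditions, such as a time window, using AND.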

Each query against Athena has a monetary cost (at the time of writing, $5 per TB of data scanned), and by using partitions you massively reduce the data that needs to be scanned.
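
To make the cost argument concrete, the arithmetic can be sketched as below. This is only an illustration: the function name estimated_cost_usd is hypothetical, the price is the figure quoted on this page and may be out of date, a terabyte is assumed to be 10^12 bytes, and the data sizes in the usage example are invented rather than measured from the GOV.UK logs.

```python
PRICE_PER_TB_USD = 5.0   # figure quoted on this page; may have changed
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes


def estimated_cost_usd(bytes_scanned: int) -> float:
    """Estimate the cost of a query from the number of bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    # Invented sizes, for illustration only.
    whole_table = 2 * 10 ** 12  # a query with no partition condition
    single_day = 2 * 10 ** 9    # the same query limited to one day
    print(f"unpartitioned: ${estimated_cost_usd(whole_table):.2f}")
    print(f"single day:    ${estimated_cost_usd(single_day):.2f}")
```

Under these invented sizes the partitioned query scans a thousandth of the data and so costs a thousandth as much.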

Example queries

Number of requests per TLS version

SELECT "tls_client_protocol", count(*)
FROM "fastly_logs"."govuk_www"
WHERE "year" = 2018 AND "month" = 8 AND "date" = 5
GROUP BY "tls_client_protocol";

Number of errors returned during an incident

SELECT "status", count(*)
FROM "fastly_logs"."govuk_www"
WHERE "year" = 2018 AND "month" = 8 AND "date" = 5
AND "request_received"
  BETWEEN timestamp '2018-08-05 10:00:00'
  AND timestamp '2018-08-05 11:00:00'
GROUP BY "status";

This page needs to be reviewed again by the page owner, #govuk-2ndline.