How logging works on GOV.UK

Home
Logging

Warning This document has not been updated for a while now. It may be out of date.

Last updated: 20 Oct 2023

To view logs, see View GOV.UK logs in Logit.

Overview

GOV.UK sends its application logs and origin HTTP request logs to managed ELK stacks hosted by Logit.io, a software-as-a-service provider. Each environment has its own ELK stack in Logit.

Fastly CDN request logs use a different system to this because of the much higher data rates involved.

edit diagram

Components in the logging path

Application container

The application container, or a sidecar/adapter container such as the nginx reverse proxy, writes log lines to stdout or stderr.

Log lines can be structured (JSON) or unstructured (arbitrary text).

Application workloads in GOV.UK’s Kubernetes clusters run with readOnlyRootFilesystem: true and should not write logs anywhere other than stdout/stderr. Only logs written to stdout/stderr will be collected by the logging system.

Kubernetes worker node

The container runtime (containerd) and kubelet on the worker node are responsible for:

writing the workload containers’ stdout/stderr streams to files in /var/log/containers
rotating those log files so as not to run out of space

Filebeat

The Filebeat daemon runs on each Kubernetes worker node and is responsible for:

finding and tailing log files on the local filesystem
applying some transformations such as parsing JSON-formatted log lines and dropping some unwanted fields
sending logs to Logstash (at Logit)

Filebeat runs as a daemonset in the cluster-services namespace.

As of mid-2023, Filebeat is installed by a Helm chart via Terraform and its configuration is also managed using Terraform. In the future, Argo CD could replace this usage of Terraform as part of ongoing efforts to reduce toil.

Logstash

Logstash is a log ingestion pipeline, essentially an ETL system for logs. Logstash and the remaining components in the logging path are hosted by Logit.io.

Logstash is responsible for:

receiving streams of log messages from each node’s Filebeat process via TLS over the Internet
parsing the semi-structured log messages from Filebeat and transforming them where necessary, for example to fit the Elastic Common Schema
loading the logs into Elasticsearch for storage and indexing

Elasticsearch

Elasticsearch is a search engine and storage/retrieval system. It is responsible for:

storing the log data
indexing the stored logs for efficient search and retrieval
running queries and returning the results to the user interface (Kibana)

Kibana

Kibana is the user interface for viewing logs. It is responsible for:

rendering the web UI
parsing user queries written in Lucene/KQL into Elastic
querying the Elasticsearch indices
displaying the results

Vulnerability Scanners

Occasionally you may see some logs that look suspicious. For example, see the controller part of the following made up example:

http_request_duration_seconds_count { action="1", container="app", controller="../../../../../etc/passwd", endpoint="metrics", job="govuk", namespace="apps", pod="static-abc123-de7f" }

This sort of thing often originates from vulnerability scanners and isn’t necessarily something to worry about. However, you should consider how such activity ended up leaking into your logs, and whether or not there’s anything you could/should do to patch it up. Read our internal guidance on the issue.