How logging works on GOV.UK
To view logs, see View GOV.UK logs in Logit.
Overview
GOV.UK sends its application logs and origin HTTP request logs to managed ELK stacks hosted by Logit.io, a software-as-a-service provider. Each environment has its own ELK stack in Logit.
Fastly CDN request logs use a different system to this because of the much higher data rates involved.
Components in the logging path
Application container
The application container, or a sidecar/adapter container such as the nginx reverse proxy, writes log lines to stdout or stderr.
Log lines can be structured (JSON) or unstructured (arbitrary text).
Application workloads in GOV.UK’s Kubernetes clusters run with `readOnlyRootFilesystem: true` and should not write logs anywhere other than stdout/stderr. Only logs written to stdout/stderr will be collected by the logging system.
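As a minimal sketch, the relevant part of a workload’s pod spec might look like this; the pod and image names are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app            # hypothetical workload name
  namespace: apps
spec:
  containers:
    - name: app
      image: example/app:latest  # hypothetical image
      securityContext:
        # With a read-only root filesystem the app cannot write log files,
        # so everything it logs must go to stdout/stderr.
        readOnlyRootFilesystem: true
```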
Kubernetes worker node
The container runtime (`containerd`) and `kubelet` on the worker node are responsible for:
- writing the workload containers’ stdout/stderr streams to files in `/var/log/containers`
- rotating those log files so as not to run out of space
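For illustration, the kubelet exposes each container’s log under `/var/log/containers` using a `<pod>_<namespace>_<container>-<container-id>.log` naming scheme, and containerd writes one CRI-formatted line (timestamp, stream, partial/full flag, message) per log entry. The pod and container names below are made up:

```
/var/log/containers/static-abc123-de7f_apps_app-0123456789abcdef.log:
2023-06-01T12:00:00.000000000Z stdout F {"level":"info","message":"GET /healthcheck 200"}
```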
Filebeat
The Filebeat daemon runs on each Kubernetes worker node and is responsible for:
- finding and tailing log files on the local filesystem
- applying transformations, such as parsing JSON-formatted log lines and dropping unwanted fields
- sending logs to Logstash (at Logit)
Filebeat runs as a `daemonset` in the `cluster-services` namespace.
As of mid-2023, Filebeat is installed by a Helm chart via Terraform and its configuration is also managed using Terraform. In the future, Argo CD could replace this usage of Terraform as part of ongoing efforts to reduce toil.
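The live configuration is defined in Terraform, but a minimal `filebeat.yml` expressing the responsibilities above might look like the following sketch; the Logstash endpoint and the dropped fields are placeholders:

```yaml
filebeat.inputs:
  # Tail the container log files written by containerd/kubelet
  - type: container
    paths:
      - /var/log/containers/*.log

processors:
  # Parse JSON-formatted log lines into structured fields
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
  # Drop fields not needed downstream (illustrative field names)
  - drop_fields:
      fields: ["agent.ephemeral_id", "ecs.version"]
      ignore_missing: true

output.logstash:
  hosts: ["example-stack.logit.io:5044"]  # placeholder Logit endpoint
  ssl.enabled: true
```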
Logstash
Logstash is a log ingestion pipeline, essentially an ETL system for logs. Logstash and the remaining components in the logging path are hosted by Logit.io.
Logstash is responsible for:
- receiving streams of log messages from each node’s Filebeat process via TLS over the Internet
- parsing the semi-structured log messages from Filebeat and transforming them where necessary, for example to fit the Elastic Common Schema
- loading the logs into Elasticsearch for storage and indexing
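A sketch of such a pipeline in Logstash’s own configuration language might look like this; the actual filters Logit runs are more involved, and the field rename is purely illustrative:

```
input {
  # Receive events from each node's Filebeat over TLS
  beats {
    port => 5044
    ssl  => true
  }
}

filter {
  # Illustrative transformation: rename a field to fit the Elastic Common Schema
  mutate {
    rename => { "log_level" => "log.level" }
  }
}

output {
  # Load the events into Elasticsearch, one index per day
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
```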
Elasticsearch
Elasticsearch is a search engine and storage/retrieval system. It is responsible for:
- storing the log data
- indexing the stored logs for efficient search and retrieval
- running queries and returning the results to the user interface (Kibana)
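For example, every Kibana search ultimately becomes an Elasticsearch query. A hand-written equivalent in the Elasticsearch query DSL might look like this; the index pattern and field names are assumptions:

```
GET logstash-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term":  { "kubernetes.namespace": "apps" } },
        { "range": { "@timestamp": { "gte": "now-15m" } } }
      ]
    }
  }
}
```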
Kibana
Kibana is the user interface for viewing logs. It is responsible for:
- rendering the web UI
- translating user queries written in Lucene or KQL (Kibana Query Language) into Elasticsearch queries
- querying the Elasticsearch indices
- displaying the results
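As an illustration, a KQL query for recent error logs from a single app might look like the line below; the exact field names depend on how the logs were parsed upstream, so treat them as assumptions:

```
kubernetes.namespace : "apps" and kubernetes.labels.app : "static" and log.level : "error"
```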
Vulnerability scanners
Occasionally you may see some logs that look suspicious. For example, see the `controller` label in the following made-up example:

```
http_request_duration_seconds_count { action="1", container="app", controller="../../../../../etc/passwd", endpoint="metrics", job="govuk", namespace="apps", pod="static-abc123-de7f" }
```
This sort of thing often originates from vulnerability scanners and isn’t necessarily cause for concern. However, you should consider how such input ended up in your logs, and whether there is anything you could or should do to fix it. Read our internal guidance on the issue.