This page describes what to do in case of an Icinga alert. For more information, search the govuk-puppet repo for the source of the alert.

Nginx 429 too many requests

Nginx is configured with a rate limit on the cache servers to stop unusual load from being passed on to the rest of the system. Sometimes a robot or a malicious client hits the cache servers above the rate limit, in which case Nginx rejects the requests, logs a message in its error log file and returns an HTTP 429 response code.

The alert should link to a Graphite graph. If the graph shows a spike, check the Nginx logs in Kibana to determine why Nginx is rejecting requests (for instance, too many requests coming from a single IP, or the same page being requested at a high rate).
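
If Kibana is unavailable and you have shell access to a cache server, the same two checks (requests per IP, requests per page) can be roughly sketched against a raw access log. This is only an illustration: it assumes Nginx's default "combined" log format, where whitespace-separated field 1 is the client IP, field 7 is the request path and field 9 is the status code. The function name and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Count HTTP 429 responses grouped by one log field, most frequent first.
# Assumes the "combined" log format: $1 = client IP, $7 = path, $9 = status.
#   $1 = log file, $2 = field number to group by, $3 = rows to show (default 10)
top_429_by_field() {
  awk -v f="$2" '$9 == 429 { count[$f]++ }
                 END { for (key in count) print count[key], key }' "$1" |
    sort -rn | head -n "${3:-10}"
}

# Hypothetical usage; the real log path on the cache servers may differ.
# top_429_by_field /var/log/nginx/access.log 1   # busiest client IPs
# top_429_by_field /var/log/nginx/access.log 7   # most requested paths
```

A single IP or path dominating the output points to the kind of client behaviour described above.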

If the message is “UNKNOWN: INTERNAL ERROR: RuntimeError: no valid datapoints” or “UNKNOWN: INTERNAL ERROR: RuntimeError: no data returned for target”, it probably means that statsd or collectd stopped submitting data for a period. Statsd metrics (those that begin with stats.) don’t get created until the first event of a given type occurs. For HTTP 429 in particular, that event may never have occurred, so the metric may never have been created. You can force creation by creating a zero-value http_429 counter:

fab $environment -H cache-1.router statsd.create_counter:cache-1_router.nginx_logs.assets-origin.http_429
fab $environment -H cache-1.router statsd.create_counter:cache-1_router.nginx_logs.www-origin.http_429
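
For background, statsd registers a counter the first time it receives a line of the form `name:value|c`, so a zero increment is enough to make the metric exist. The sketch below shows only that wire format. It is not the implementation of the `statsd.create_counter` Fabric task, and the helper name, the target host and port (8125 is the statsd default) and the use of `nc` are all assumptions.

```shell
#!/bin/sh
# Format a statsd counter datagram: "<metric>:<value>|c".
#   $1 = metric name, $2 = increment (defaults to 0)
statsd_counter_line() {
  printf '%s:%s|c\n' "$1" "${2:-0}"
}

# Hypothetical send over UDP; host and port are assumptions.
# statsd_counter_line cache-1_router.nginx_logs.www-origin.http_429 | nc -u -w1 localhost 8125
```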
This page is owned by #2ndline and needs to be reviewed