High Nginx 5xx rate
You can view the 5xx logs across all machines on this dashboard:
Change the hostname to view different apps.
The alert should link to a Graphite graph. Certain applications, such as Whitehall, often have short spikes in 5xx errors. If you can determine that the alert was caused by a spike, acknowledge the alert and let a team that is working on the app know (or alert Platform Reliability).
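As a rough illustration of what "determining this is a spike" could mean, the sketch below classifies a series of 5xx rates read off the graph. It is not part of the alerting setup: the function name, the `threshold` value and the `max_spike_points` cut-off are all assumptions made for the example, and you should pick values that match the alert's real threshold.

```python
from typing import Sequence


def classify_5xx_series(
    rates: Sequence[float],
    threshold: float,
    max_spike_points: int = 3,  # assumed cut-off, not taken from the alert config
) -> str:
    """Classify 5xx rates sampled at a fixed interval, oldest first.

    Returns "ok" if no sample exceeds the threshold, "spike" if only a few
    samples exceed it and the most recent sample has recovered, and
    "sustained" otherwise.
    """
    above = [rate > threshold for rate in rates]
    if not any(above):
        return "ok"
    if sum(above) <= max_spike_points and not above[-1]:
        return "spike"
    return "sustained"


if __name__ == "__main__":
    # Two elevated samples followed by recovery: treated as a spike.
    print(classify_5xx_series([1, 1, 50, 40, 1, 1], threshold=10))
```

A "spike" result matches the case where acknowledging the alert and notifying the owning team is enough; a "sustained" result suggests carrying on with the checks in the rest of this page.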
Multiple applications reporting errors
If multiple applications are reporting 5xx errors, there is likely to be a common cause. The first thing to check is whether content-store is returning errors. If content-store is not reporting any errors but the frontend apps that depend on it are (see the bottom of the dashboard), the ARP cache may need to be flushed.
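The order of checks in the paragraph above can be written down as a small decision function. This is only a sketch of the reasoning, assuming you have already read per-app 5xx counts off the dashboard; the function name, the input shape and the example frontend app names are hypothetical.

```python
from typing import Iterable, Mapping


def suggest_first_check(
    errors_by_app: Mapping[str, int],
    frontend_apps: Iterable[str],
    content_store: str = "content-store",
) -> str:
    """Suggest where to look first, given 5xx counts per application."""
    erroring = {app for app, count in errors_by_app.items() if count > 0}
    if len(erroring) < 2:
        # Not the "multiple applications" case this section covers.
        return "single app: check that app's own graphs and logs"
    if content_store in erroring:
        return "investigate content-store first"
    if erroring & set(frontend_apps):
        return "content-store looks healthy: consider flushing the ARP cache"
    return "look for another common cause"


if __name__ == "__main__":
    # Hypothetical counts: two frontend apps erroring, content-store healthy.
    counts = {"content-store": 0, "frontend-a": 42, "frontend-b": 17}
    print(suggest_first_check(counts, frontend_apps=["frontend-a", "frontend-b"]))
```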
Sometimes a high 5xx rate is caused by a sudden increase in traffic to the site. Use the Nginx Requests (AWS) dashboard to check whether a particular machine class is receiving an unusually high number of requests. If it is, consider scaling up the number of machines available to handle the requests.
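One way to make "unusually high" concrete is to compare the current request rate for each machine class against a rate you consider normal. The sketch below assumes you have copied both sets of figures from the dashboard by hand; the function name, the machine class names in the usage example and the `factor` of 2.0 are illustrative assumptions, not values used by the dashboard.

```python
from typing import List, Mapping


def classes_with_unusual_traffic(
    current: Mapping[str, float],
    baseline: Mapping[str, float],
    factor: float = 2.0,  # assumed multiplier for "unusually high"
) -> List[str]:
    """Return machine classes whose current rate exceeds factor x baseline."""
    flagged = []
    for machine_class, rate in current.items():
        usual = baseline.get(machine_class)
        if usual is None or usual <= 0:
            continue  # no usable baseline, so nothing to compare against
        if rate > factor * usual:
            flagged.append(machine_class)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical requests-per-second figures.
    now = {"frontend": 900.0, "backend": 110.0, "cache": 50.0}
    usual = {"frontend": 300.0, "backend": 100.0}
    print(classes_with_unusual_traffic(now, usual))
```

Any class the function returns is a candidate for scaling up; classes without a baseline are skipped rather than guessed at.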