Datadog monitors for slow SQL queries on MySQL
Pingdom makes sure the servers are responding
Redundancy is provided by Rackspace monitoring, which covers the dedicated DB server, the load balancer, and the web servers
Then I run a custom cron job on each server every minute that sends the server's vitals to Firebase (attached), which I watch during the day.
We're subscribed to the status pages of our third-party services (Authorize.net, PayPal, etc.), so we get an email if something is wrong with them.
Goaccess.io runs on each server and stays open in a terminal all day to show me real-time traffic from the top IP addresses
And each of our offices has a monitor hanging on the wall that shows real-time Google Analytics.
It's basically impossible for the site to have a problem without someone noticing it within 1 minute :)
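
For anyone curious what the per-minute cron job could look like, here is a minimal sketch, not the actual script. It assumes the vitals are load averages and disk usage, that they are written with the Firebase Realtime Database REST API (a PUT to a `.json` URL), and that the database URL comes from a hypothetical `FIREBASE_URL` environment variable. The `servers/<host>` path and the field names are made up for illustration, and authentication is left out.

```python
import json
import os
import shutil
import socket
import time
import urllib.request


def collect_vitals(path="/"):
    """Gather a few basic health numbers for this machine (Unix only)."""
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage(path)
    return {
        "load_1m": round(load1, 2),
        "load_5m": round(load5, 2),
        "load_15m": round(load15, 2),
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
        "timestamp": int(time.time()),
    }


def build_request(base_url, host, vitals):
    """Build a PUT that overwrites servers/<host> with the latest vitals."""
    # The servers/<host> layout is an assumption, not the author's schema.
    url = f"{base_url.rstrip('/')}/servers/{host}.json"
    body = json.dumps(vitals).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Only send when the (hypothetical) FIREBASE_URL variable is set,
# e.g. https://example-project.firebaseio.com
if __name__ == "__main__" and os.environ.get("FIREBASE_URL"):
    # Firebase keys cannot contain ".", so dots in the hostname are replaced.
    host = socket.gethostname().replace(".", "-")
    req = build_request(os.environ["FIREBASE_URL"], host, collect_vitals())
    with urllib.request.urlopen(req, timeout=10) as resp:
        resp.read()
```

A crontab entry along the lines of `* * * * * FIREBASE_URL=https://example-project.firebaseio.com python3 /path/to/send_vitals.py` would run it every minute (the URL and path are placeholders). Overwriting one node per server with PUT keeps only the latest reading, which is enough for a dashboard you glance at during the day.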
