May 15th Outage: Post Mortem

Thursday, May 15th, at approximately 3:30pm Mountain Time, our customers experienced 45 minutes of unexpected downtime. The root cause of the problem was a misconfigured load balancer. A planned maintenance event increased the traffic load for a short period of time causing the misconfigured load balancer to behave poorly and eventually fail. This resulted in a cascading event as each load balancer crashed when traffic was diverted to the next available load balancer.

We worked with our hosting provider to adjust the configuration and to add additional monitoring and watch processes. We don’t expect this problem to occur again in the future and apologize for any inconvenience this outage may have caused.


Originally published here.

Originally published by: Mark Michael.

Subscribe to LiteTracker

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.
jamie@example.com
Subscribe