At 14:37 UTC on October 17, 2024, our production monitoring stack lit up with 12,400 HTTP 500 (Internal Server Error) responses in 60 seconds, impacting 20,000 active users across 14 sharded MongoDB 9.0 clusters. The root cause wasn’t a bad deployment, a network par...
blog.johal.in · 17 min read