Akshay Siwal · akshay-siwal.hashnode.dev · Nov 20, 2024
Understanding Slack's January 2021 Outage: A Restaurant Analogy
On January 4th, 2021, Slack experienced a significant outage that affected millions of users worldwide. While the root cause was traced to degraded AWS Transit Gateways (network connectivity issues), an interesting cascade of events occurred when Sla...
slack
Christopher Chilengwe · christopherchilengwe.hashnode.dev · Aug 12, 2024
Postmortem Report
Summary On August 12th, 2024 at midnight the server access went down resulting in 504 error for anyone trying to access a website. Background on the server being based on a LAMP stack. Timeline 00:00 PST - 500 error for anyone trying to access the we...
postmortem on web infrastructure
Oruma Gideon · gideonoruma.hashnode.dev · Jan 19, 2024
Postmortem: The Case of the Rebellious Query
**Postmortem: The Case of the Rebellious Query** Executive Summary: Duration: 4 hours (14:00 to 18:00 UTC) Impact: 30% of users experienced slow response times and intermittent disruptions on our e-commerce platform. Root Cause: A rogue SQL que...
Alx
Samuel Ogboye · samuelogboye.hashnode.dev · Nov 6, 2023
Postmortem Report: E-commerce Website Outage - A Rollercoaster Ride in the Digital Realm
Introduction: A postmortem report, also known as a post-mortem analysis or simply a postmortem, is a document that provides a detailed and systematic review of a project, event, process, or situation after it has concluded or failed. It is used to as...
1 like · 50 reads · postmortem
Idris Yakub · driiisdev.hashnode.dev · Aug 23, 2023
Typical Postmortem Report: Web Application Service Downtime
Issue Summary: Outage Duration: August 15, 2023, 10:45 AM - August 15, 2023, 12:30 PM (UTC) Impact: Web Application Service Downtime Users Affected: Approximately 25% of users experienced service disruption, leading to slow loading times and intermit...
postmortem
chuk'ssomzzy · somzzy.hashnode.dev · Aug 13, 2023
Postmortem
In software engineering system failure is a way to build a robust system and a core part of learning from the failure. This blog post would outline a system failure that I recently experienced and step taken to fix the errors. Issue Summary On Tuesda...
postmortem on web infrastructure
Evans Muuo · evansmuuo.hashnode.dev · Jun 16, 2023
Postmortem
Issue Summary: Duration: June 14, 2023, 10:00 AM - June 14, 2023, 11:30 AM (UTC) Impact: The authentication service was down, resulting in users being unable to log in to the platform. Approximately 30% of users were affected by this issue. Root Caus...
postmortem on web infrastructure
Ikenna Udemezue · iykethe1st.hashnode.dev · May 14, 2023
Postmortem: E-commerce Website Outage Caused by Database Misconfiguration (for documentation purposes only)
Issue Summary: On May 13, 2023, from 8:00 PM to 12:30 AM PST, our e-commerce website experienced a complete outage, which resulted in users being unable to access any pages on the website. All of our users were affected by this issue. Timeline: 8:00...
31 reads · postmortem on web infrastructure
Durosinlohun Uthman · druth.hashnode.dev · May 14, 2023
Postmortem Analysis
This post-mortem report details the incident that occurred on the 14th day of April 2023, the incident affected the web server and resulted in the outage of our website. Incident Summary The incident was reported between 2:30 PM WAT and 3:00 PM WAT. ...
1 like · 27 reads · General Programming
Samuel Chinonso Archibong · blackisking.hashnode.dev · May 13, 2023
My first postmortem (Incident Report)
Any software system will eventually fail, and that failure can stem from a wide range of possible factors: bugs, traffic spikes, security issues, hardware failures, natural disasters, human error… Failing is normal and failing is actually a great opp...
postmortem