Muhammad Hassaan Javed in blog.infraforge.agency · Jun 17 · 12 min read
When a hardening rollout breaks 8 layers and your own reconciler fights you
The first thing the on-call team tried was patching the status ConfigMap. Five apps showed Progressing in the platform's bleater-status object, the dashboard had been red for four hours, and somebody

Muhammad Hassaan Javed in blog.infraforge.agency · Jun 16 · 11 min read
When a validating webhook blocks the ConfigMap that would fix it
The kubectl patch came back as a webhook timeout, not a credentials error. That was the moment the incident stopped being about a rotated MongoDB password and started being about the admission layer.

Muhammad Hassaan Javed in blog.infraforge.agency · Jun 2 · 10 min read
ArgoCD CVE-2022-24348: a Secret leak that hid in log volume
The first thing we saw in Loki was a fanout service log line that contained the string 'a2V5Y2xvYWstY2xpZW50' repeated about 40 times in a single minute. Base64 decode: 'keycloak-client'. The fanout s

Muhammad Hassaan Javed in blog.infraforge.agency · Jun 1 · 10 min read
Why Grafana OnCall acknowledgments hang after a Helm upgrade migration
The call did not come from our on-call rotation. It came from a customer who noticed two unrelated degradations on their side and asked why we had not paged. We had not paged because Grafana OnCall ha

Muhammad Hassaan Javed in blog.infraforge.agency · May 30 · 9 min read
Why a deleted backup Lambda kept billing 9,400 EBS snapshots
The EBS Snapshot line on the monthly bill was $1,830. There was no active EBS snapshot policy on the account. The backup Lambda that had produced these snapshots had been deleted thirteen months earli