devopsofworld.com
Zero-Downtime Migration from NGINX Ingress to Gateway API on Amazon EKS (Production Case Study)
A Zero-Downtime, Step-by-Step Implementation Guide. 1. Overview: In this post, we walk through a real production migration of a Kubernetes workload from NGINX Ingress Controller to Kubernetes Gateway API, implemented using Envoy Gateway, on Amazon EKS...
3d ago · 6 min read

devopsofworld.com
Production Incident: Node.js Application Did Not Start After Server Reboot (PM2 + systemd Fix)
Context: We were running a Node.js backend using PM2 on a Linux server. Application details: Process manager: PM2; Mode: fork; User: root; Deployment: Manual setup on VM; No containerization; No autoscaling. The service was running fine in steady st...
4d ago · 4 min read

devopsofworld.com
Deploying Apache Superset on Kubernetes (Helm): From Chaos to Production
Introduction: Deploying Apache Superset on Kubernetes using the official Helm chart appears straightforward when following the documentation. In real-world environments, however, production deployments often expose issues across multiple layers: Helm...
5d ago · 6 min read

devopsofworld.com
Kubernetes Outage Postmortem: Nodes Stuck in NotReady Due to CNI Failure
Recently, we encountered a critical production outage in our Kubernetes cluster. New nodes provisioned during autoscaling remained in a NotReady state, leading to service disruptions and failed health checks across workloads. In this post, I’ll walk ...
May 28, 2025 · 3 min read

devopsofworld.com
How to Set Up Disaster Recovery (DR) for AWS MSK with MirrorMaker 2 – Step-by-Step Guide
In today's cloud-native world, ensuring high availability and resilience for streaming platforms like Apache Kafka is mission-critical. Amazon MSK (Managed Streaming for Apache Kafka) offers a powerful, fully managed Kafka service. However, it doesn'...
May 27, 2025 · 5 min read