@aditmodi24
Solution Architect | 12x AWS Certified
I’m a Solution Architect at Lauren, an AWS UG Vadodara Co-Organizer, and a HashiCorp Ambassador.
Public speaking, podcasts, guest blog posts, and consulting in the Cloud/DevOps/BigData space.
1d ago · 18 min read · From integer counting to structured resources: how Dynamic Resource Allocation and the AI Cluster Readiness framework finally make GPU infrastructure manageable at scale.
2d ago · 16 min read · Dear EKS & AI Infrastructure enthusiasts, Welcome to Everything about EKS & AI Infrastructure #62. There’s something uncomfortable happening in AI infrastructure right now that nobody says out loud: th
Mar 28 · 27 min read · Your vLLM cluster has a problem you probably don't know about. It's not a bug. Nothing is crashing. The metrics dashboard looks fine. But right now, every time a request hits your load balancer, there
Mar 28 · 30 min read · There's a class of production incident that doesn't page anyone. No error rate spikes. No latency alert fires. The cluster health dashboard shows green. GPU nodes are online. Pods are running. And yet