Kubernetes GPU Sharing: NVIDIA MIG + DRA on Amazon EKS
GPU scheduling in Kubernetes has always felt like buying a mansion when you need a studio apartment. A small inference workload that needs 2GB of GPU memory gets scheduled on an entire 80GB A100, and there's nothing you can do about it. The device pl...
blog.ediri.io · 29 min read