Prometheus Metrics for Batch Jobs on Kubernetes
Introduction
PathAI’s core engineering work is built around scheduling, running, tracking, and otherwise managing compute jobs, usually as part of a machine learning pipeline. For Machine Learning Scientists and Engineers, the ability to experiment w...
pathai.hashnode.dev8 min read