We cut our LLM trace bill 30% with two sampling rules
Our observability bill for LLM traces was climbing in a straight line with request volume, and most of what we were paying to store was boring: successful calls that did exactly what they were suppose
jas-blogs.hashnode.dev · 2 min read