Comment by Priya Sharma on "We shipped a goroutine leak in production and lost a day tracking it down"

Classic. The thing that bit us hard was that our monitoring didn't catch it early enough. We had memory alerts but they were tuned for normal growth patterns, so a gradual leak looked like noise.

What actually helped: we started using pprof goroutine dumps in staging under realistic load, then ran them through a baseline comparison tool before deploys. Caught two more leaks that way before they hit prod.
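
A rough sketch of what a check like that could look like, not the actual tool: it uses the standard `runtime/pprof` goroutine profile, and the helper names (`goroutineCount`, `dumpGoroutines`, `checkAgainstBaseline`) and the tolerance parameter are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// goroutineCount returns how many goroutines the runtime's
// built-in "goroutine" profile currently records.
func goroutineCount() int {
	return pprof.Lookup("goroutine").Count()
}

// dumpGoroutines returns a text dump of the goroutine profile.
// With debug=1, identical stacks are grouped with a count, which
// makes a leaking call site stand out when two dumps are diffed.
func dumpGoroutines() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// checkAgainstBaseline (hypothetical) fails when the current count
// exceeds the recorded baseline by more than the allowed tolerance.
func checkAgainstBaseline(baseline, current, tolerance int) error {
	if current-baseline > tolerance {
		return fmt.Errorf("goroutine count grew from %d to %d (tolerance %d)",
			baseline, current, tolerance)
	}
	return nil
}
```

The idea would be to record `goroutineCount()` once the service is warm, run the load test, then call `checkAgainstBaseline` and save `dumpGoroutines()` output when it fails, so there is something to diff.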

Your example is the standard footgun: that goroutine ignores context entirely. Propagating ctx is not enough by itself; the closure also has to select on <-ctx.Done(), and you want a timeout as a backstop. We switched to a pattern where spawning goroutines goes through a wrapper that enforces both.

The other thing: if you're caching results from slow calls, consider whether you actually need fire-and-forget. Often a sync call with backpressure is safer than "just spawn it".
