Classic. The thing that bit us hard was that our monitoring didn't catch it early enough. We had memory alerts but they were tuned for normal growth patterns, so a gradual leak looked like noise.
What actually helped: we started using pprof goroutine dumps in staging under realistic load, then ran them through a baseline comparison tool before deploys. Caught two more leaks that way before they hit prod.
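For anyone who wants to try the baseline idea without extra tooling, here is a rough sketch of the shape. It is not the comparison tool we used: `exceedsBaseline` and the tolerance value are made up for illustration, and the "leak" is simulated with goroutines blocked on a channel. `runtime.NumGoroutine` and `pprof.Lookup("goroutine")` are the standard library calls.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// exceedsBaseline is a hypothetical check: it reports whether the current
// goroutine count has grown past the baseline by more than the tolerance.
func exceedsBaseline(baseline, current, tolerance int) bool {
	return current-baseline > tolerance
}

func main() {
	baseline := runtime.NumGoroutine()

	// Simulate a leak: goroutines blocked forever on a channel nobody writes to.
	block := make(chan struct{})
	for i := 0; i < 5; i++ {
		go func() { <-block }()
	}

	current := runtime.NumGoroutine()
	if exceedsBaseline(baseline, current, 2) { // tolerance of 2 is an arbitrary assumption
		fmt.Printf("goroutines grew from %d to %d\n", baseline, current)
		// debug=1 writes a human-readable dump grouped by identical stacks.
		if err := pprof.Lookup("goroutine").WriteTo(os.Stderr, 1); err != nil {
			fmt.Fprintln(os.Stderr, "dump failed:", err)
		}
	}
}
```

In a real setup you would take the baseline after warm-up and the second reading after the load test, then diff the dumps rather than just the counts.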
Your example is the standard footgun: that goroutine ignores context entirely. Propagating ctx isn't enough on its own; the closure also needs a select on <-ctx.Done(), plus a timeout as a backstop. We switched to a pattern where spawning goroutines goes through a wrapper that enforces both.
The other thing: if you're caching results from slow calls, consider whether you actually need fire-and-forget. Often a sync call with backpressure is safer than "just spawn it".