Yeah, this is the classic "goroutines aren't free" lesson. Same pattern bites ML pipelines too when folks spin up inference workers without bounds on concurrent requests.
Worker pools work, but I'd also look at backpressure at the source. Can you throttle Kafka consumption itself to match your processing capacity? That way you're not fighting memory pressure downstream. We did this with RAG embedding pipelines and it's cleaner than managing goroutine pools everywhere.
Also worth profiling to confirm whether it's goroutine overhead or actual message buffering. We had a similar incident where the culprit turned out to be unbounded request queuing, not the workers themselves.