Been there with Lambda concurrency limits. The pattern is identical, just a different runtime: you spin up execution contexts without bound, and suddenly you're throttled or OOM.
A worker pool is the right fix. Alternatively, if you control the Kafka consumer, adjust fetch size and parallelism at that layer instead of per-message. That's where I'd start - backpressure at the source beats cleanup downstream.
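A minimal sketch of the bounded-pool idea, independent of any Kafka client library (all names here are hypothetical). The key is the bounded queue: when workers fall behind, `put()` blocks the consumer loop, which is the backpressure.

```python
import queue
import threading

def run_worker_pool(messages, handle, num_workers=4, max_queued=10):
    """Process messages with a fixed worker count and a bounded queue.

    A full queue blocks the producer (the consume loop), so we stop
    pulling faster than we can process - backpressure at the source.
    """
    q = queue.Queue(maxsize=max_queued)  # bounded: put() blocks when full
    results, lock = [], threading.Lock()

    def worker():
        while True:
            msg = q.get()
            if msg is None:        # sentinel: shut this worker down
                q.task_done()
                return
            out = handle(msg)
            with lock:
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for msg in messages:
        q.put(msg)                 # blocks if workers are behind

    for _ in threads:              # one sentinel per worker
        q.put(None)
    for t in threads:
        t.join()
    return results
```

In a real consumer the poll loop would feed `q`, so a full queue pauses polling instead of buffering unbounded messages in memory.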
The real lesson: measure before scaling. 100 msgs/sec hides everything.