Been running a gRPC service on Lambda for 18 months. Cold starts were killing us - median 3.2s, p99 hitting 8s. Customers noticed. We tried the obvious stuff first (provisioned concurrency, bigger memory) but that was expensive and didn't fully solve it.
What actually helped: moving initialization code out of the handler entirely. I was creating database connections, gRPC stubs, and loading config on every invocation. Moved all of that to module-level code that runs once during initialization.
# before: slow - connections and stubs rebuilt on every invocation
def lambda_handler(event, context):
    conn = psycopg2.connect(...)
    stub = grpc_stub(channel)
    ...

# after: connection lives outside the handler, created once per container
_conn = psycopg2.connect(...)
_stub = grpc_stub(channel)

def lambda_handler(event, context):
    # reuses _conn and _stub across warm invocations
    ...
Dropped cold starts from 3.2s to 800ms. Still not ideal, but acceptable for our SLA. The database connection wasn't even the main bottleneck - it was the repeated TLS handshakes and gRPC channel setup.
Real talk: if you're hitting this problem at scale, serverless might just be the wrong architecture for your workload. We've talked about moving back to containers. At least with ECS you know what you're paying for.
totally valid point. though for most cases, environment variables + lazy loading in handlers keeps things cleaner. only grab what you need, when you need it. secrets manager is overkill unless you're rotating frequently.
That's the real lesson nobody emphasizes enough. Cold starts aren't actually about Lambda being slow - they're about how you structure initialization.
Your experience matches mine. I moved a Rust Lambda function from creating connections per-request to using lazy_static for DB pools. Went from 2.1s cold start to 280ms. The runtime itself wasn't the bottleneck, initialization was.
Provisioned concurrency is just a band-aid over poor design. You're paying for always-on instances when the real problem is doing expensive work in handler scope.
The tradeoff: module-level initialization means more careful state management and harder testing. But that's a better problem than customers timing out.
That's the critical insight most people miss. Cold start pain is usually a symptom of doing too much work at invocation time rather than a platform limitation.
What you did (module-level initialization) is exactly right, but the real lesson is that serverless forces you to think about initialization differently than traditional servers. Your gRPC stubs and DB connections should live outside the handler scope anyway for connection pooling.
The expensive solutions (provisioned concurrency, upsizing) are band-aids. They work but you're paying for idle capacity. Better approach: profile what's actually slow in your cold path, separate concerns, and make sure expensive resources are initialized once per container lifecycle, not per request.
That said, if your p99 is still hitting seconds after optimization, serverless might genuinely be the wrong tool for that workload. Sometimes it's worth switching to ECS or K8s where cold starts aren't a factor at all.
Been there. Cold starts are real, but the actual problem is usually what you discovered: wasteful initialization. Module-level init is the right move.
That said, 3.2s median suggests something else was happening too. If you're still on Lambda, consider whether gRPC over HTTP/2 is the right choice here. We ditched gRPC for internal services and moved to Postgres subscriptions for async work. Eliminated a whole category of connection overhead.
For stateful services like yours, Lambda was probably the wrong tool. Did you evaluate moving to ECS, or even just a box?
Yeah, this is the real talk nobody mentions in the marketing material. I've seen the same pattern with Go services on Lambda. The module-level init thing works, but honestly, we just migrated off Lambda for anything latency-sensitive.
The cost math on provisioned concurrency eats into your savings anyway. For us, ECS Fargate with auto-scaling gave way more predictable performance at a lower total cost. Cold starts are a Lambda tax that never fully goes away.
Your approach is solid though. If you're stuck with Lambda, that's the right move.
Chloe Dumont
Security engineer. AppSec and pen testing.
Yeah, this is real. The module-level initialization trick is solid, but watch your security surface. I've seen teams load database credentials, API keys, all of it at init time and suddenly you've got secrets sitting in memory across invocations.
Use something like AWS Secrets Manager with caching, not hardcoded env vars. Also, if you're doing gRPC stubs at init, make sure connection pooling doesn't leak across requests in unexpected ways. That's where bugs hide.
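A sketch of that caching idea, assuming Python (`make_secret_getter` is a hypothetical helper; the boto3 call shown in the comment is the standard Secrets Manager `get_secret_value` API):

```python
import json
from functools import lru_cache

def make_secret_getter(fetch):
    # fetch(secret_id) -> raw JSON string; wrapped so each secret is
    # retrieved at most once per container lifecycle
    @lru_cache(maxsize=None)
    def get_secret(secret_id):
        return json.loads(fetch(secret_id))
    return get_secret

# in a real function you'd wire it up roughly like this (untested sketch):
#   client = boto3.client("secretsmanager")
#   get_secret = make_secret_getter(
#       lambda sid: client.get_secret_value(SecretId=sid)["SecretString"])
```

Because the cache is per-container, secrets still refresh on every cold start, so rotation lag is bounded by container lifetime rather than being permanent.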
Provisioned concurrency is expensive but sometimes cheaper than the operational debt of customer complaints. Worth doing the math on your actual traffic patterns.