Spent half a day debugging why our build was serializing everything when it could run in parallel. Turns out we were doing this:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test
      - run: cargo clippy
      - run: cargo fmt --check
Each step waits for the previous one. Dumb. Should be:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo test
  # clippy and fmt both wait for test, then run in parallel with each other.
  clippy:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      - run: cargo clippy
  fmt:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      - run: cargo fmt --check
clippy and fmt run at the same time once test finishes. test is a real dependency, not just first-job-ism. Dropped our total time from 12min to 6min. Separate jobs run in parallel by default, and the needs keyword is how you say which ones have to wait. It's simple but easy to miss if you're not paying attention. GitHub Actions docs bury it.
If you're not explicitly modeling dependencies, you're leaving time on the table.
Yeah, but be careful here. Those steps actually don't need parallelization—they're all reading the same checkout, and cargo's pretty good at not re-downloading deps. Your real win was probably just not having one slow step block the others.
The actual useful parallel pattern is when you have genuinely independent work. Like running tests on different database versions, or building for multiple architectures. That's where job-level deps shine.
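For the database-version case, a matrix is probably the cleanest way to express it. A rough sketch, assuming Postgres as the database and a DATABASE_URL variable that the tests read (both are made up for illustration, as are the version numbers):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      # Keep the other versions running if one of them fails.
      fail-fast: false
      matrix:
        # Hypothetical versions; swap in whatever you actually support.
        postgres: ["14", "15", "16"]
    services:
      db:
        # One service container per matrix entry.
        image: postgres:${{ matrix.postgres }}
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        # Wait until Postgres accepts connections before running steps.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - run: cargo test
        env:
          # Assumed name; use whatever your test suite reads.
          DATABASE_URL: postgres://postgres:postgres@localhost:5432/postgres
```

Each matrix entry becomes its own job, so the three versions run in parallel with no needs between them.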
Single-threaded step overhead is almost never your bottleneck. Focus on the actual expensive operations first.
Nina Okafor
ML engineer working on LLMs and RAG pipelines
Yeah, that's a real gotcha. Though I'd push back slightly - for most teams, the first pattern is fine. The overhead of spinning up multiple job contexts often kills your gains unless you're dealing with legitimately long-running tasks (like integration tests that actually take minutes).
Where parallel jobs shine is when you have real dependencies you can express cleanly. But if you're just splitting cargo test and cargo clippy, you're usually burning more in setup time than you save.
The real win is usually splitting test suites that actually block each other or have different hardware needs. That half day debugging was probably worth it though if your builds were hitting timeouts.
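If you do split into jobs, some of that setup cost can be clawed back by caching Cargo's downloads and build output between runs. A minimal sketch using actions/cache; the key format is just one reasonable choice, and how much it helps will depend on the project:

```yaml
jobs:
  clippy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Restore the registry and build output from a previous run, if present.
      - uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/registry
            ~/.cargo/git
            target
          # A new cache entry is created whenever Cargo.lock changes.
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
      - run: cargo clippy
```

Each job runs on a fresh runner, so every split-out job that compiles needs its own cache step like this one.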