I inherited a GitLab CI pipeline that ran fine locally but dropped test reports in production. The issue: we were running tests in parallel across 10 runners, but the artifacts:reports:junit section only kept the report from the last job to finish.
The problem was in our .gitlab-ci.yml:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit.xml
When you parallelize without per-job naming, every parallel job writes its report to the same path, so each upload clobbers the previous one. We were losing roughly 90% of test failures because only the last completed job's report survived.
The fix required two changes. First, use artifacts:name with a unique identifier and artifacts:paths to capture all reports:
test:
  parallel: 10
  artifacts:
    name: "junit-$CI_NODE_INDEX"
    paths:
      - coverage/junit-*.xml
    reports:
      junit: coverage/junit-*.xml
Second, we reconfigured our test runner to output reports with numeric suffixes based on CI_NODE_INDEX.
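Putting both changes together, the job might look something like this (pytest and its --junitxml flag are just one illustration here; substitute your own runner's equivalent option):

```yaml
test:
  parallel: 10
  script:
    # CI_NODE_INDEX is set automatically for parallel jobs (1..CI_NODE_TOTAL)
    - pytest --junitxml="coverage/junit-${CI_NODE_INDEX}.xml"
  artifacts:
    name: "junit-$CI_NODE_INDEX"
    paths:
      - coverage/junit-*.xml
    reports:
      junit: coverage/junit-*.xml
```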
What I learned: parallel CI jobs need explicit artifact handling. Don't assume glob patterns work the same way in CI as they do locally. Test your artifact collection separately from your test execution. We added a dedicated validation step that counts expected vs. actual report files before proceeding.
This caught a similar issue in our data pipeline tests three months later. Worth the investment.
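A minimal sketch of that validation guardrail, as a job in a later stage (the job and stage names are illustrative, and 10 matches the parallel count above):

```yaml
validate-reports:
  stage: verify
  script:
    # Artifacts from earlier-stage jobs are downloaded by default,
    # so all per-shard reports should be present here.
    - count=$(ls coverage/junit-*.xml | wc -l)
    - test "$count" -eq 10 || { echo "Expected 10 reports, found $count"; exit 1; }
```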
Sophia Devy, Full-Stack Developer
Ah yeah, this is a classic parallel execution gotcha. The artifact path needs to be unique per job. What actually worked for us was using ${CI_NODE_INDEX} to namespace the output:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Then in your test runner, make sure each parallel job writes to its own file. We also added a post-processing step that merges all junit xmls before uploading, since some tools expect a single report.
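A post-processing job along those lines could look like this (junitparser is our choice of merge tool here, not something the thread prescribes; any JUnit XML merger works):

```yaml
merge-reports:
  stage: report
  script:
    # Merge the per-shard reports into the single file some tools expect
    - pip install junitparser
    - junitparser merge coverage/junit-*.xml coverage/junit-merged.xml
  artifacts:
    paths:
      - coverage/junit-merged.xml
```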
The silent failure part is brutal though. We caught ours by accident during a sprint review when test coverage metrics looked too good.
Yeah, that's a classic gotcha with parallel jobs. The fix is using the ${CI_NODE_INDEX} variable to give each parallel job its own artifact path:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Then GitLab collects all of them. We hit this exact issue in production last year. The silent failure part is the worst: you just think your test coverage mysteriously dropped 50%.
Worth checking if you're also losing coverage reports the same way. If you're using multiple report formats, each needs its own naming strategy.
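As an example of sharding multiple formats, a job emitting both JUnit and Cobertura reports might namespace each one (the coverage_report syntax assumes GitLab 15.0+):

```yaml
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-${CI_NODE_INDEX}.xml
```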
Yeah, this is a footgun. The parallel job syntax creates N identical jobs and they all try to write to the same artifact path, so later ones overwrite earlier ones. GitLab should warn about this harder.
The fix is using the matrix strategy or explicit job names with different artifact paths:
test:
  parallel:
    matrix:
      - SHARD: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  artifacts:
    reports:
      junit: coverage/junit-${SHARD}.xml
Then use a separate job to merge them. We hit this exact issue at scale and spent a week wondering why flaky tests weren't showing up in reports. The real problem is that CI systems assume you know this, but it's not obvious until you're running it.
Yeah, that's a classic footgun. Parallel jobs need unique artifact paths or you get silent overwrites. The fix is either:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Or use glob patterns if your test runner already splits output. We hit this exact issue with RN test suites last year. The worst part is that it looks like everything passed, since each job individually succeeds. Worth adding a sanity check to your pipeline that validates that the number of report files matches the number of jobs.
DevOps engineer. Terraform and K8s all day.
This is such a real CI footgun: parallelization makes everything look "green" while you're quietly losing signal. The root cause you described is a great reminder that artifact handling has to be treated like first-class pipeline logic, not an afterthought: if jobs shard, then reports must shard too (e.g., naming by CI_NODE_INDEX or a matrix variable) or they'll overwrite each other.
I also really like the idea of adding a validation step that checks "expected vs. actual report files" before proceeding; simple guardrails like that prevent weeks of false confidence in test health.