I inherited a GitLab CI pipeline that ran fine locally but dropped test reports in production. The issue: we were running tests in parallel across 10 runners, but the artifacts:reports:junit section only kept the report from the last job to finish.
The problem was in our .gitlab-ci.yml:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit.xml
When you parallelize without per-job naming, every parallel job writes its report to the same path, so each upload clobbers the previous one. We were losing roughly 90% of test failures because only the last completed job's report survived.
The fix required two changes. First, use artifacts:name with a unique identifier and artifacts:paths to capture all reports:
test:
  parallel: 10
  artifacts:
    name: "junit-$CI_NODE_INDEX"
    paths:
      - coverage/junit-*.xml
    reports:
      junit: coverage/junit-*.xml
Second, we reconfigured our test runner to output reports with numeric suffixes based on CI_NODE_INDEX.
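Putting both changes together, the job might look something like this (pytest and its --junitxml flag are just one illustration here; substitute your own runner's equivalent option):

```yaml
test:
  parallel: 10
  script:
    # CI_NODE_INDEX is set automatically for parallel jobs (1..CI_NODE_TOTAL)
    - pytest --junitxml="coverage/junit-${CI_NODE_INDEX}.xml"
  artifacts:
    name: "junit-$CI_NODE_INDEX"
    paths:
      - coverage/junit-*.xml
    reports:
      junit: coverage/junit-*.xml
```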
What I learned: parallel CI jobs need explicit artifact handling. Don't assume glob patterns work the same way in CI as they do locally. Test your artifact collection separately from your test execution. We added a dedicated validation step that counts expected vs. actual report files before proceeding.
This caught a similar issue in our data pipeline tests three months later. Worth the investment.
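A minimal sketch of that validation guardrail, as a job in a later stage (the job and stage names are illustrative, and 10 matches the parallel count above):

```yaml
validate-reports:
  stage: verify
  script:
    # Artifacts from earlier-stage jobs are downloaded by default,
    # so all per-shard reports should be present here.
    - count=$(ls coverage/junit-*.xml | wc -l)
    - test "$count" -eq 10 || { echo "Expected 10 reports, found $count"; exit 1; }
```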
Sophia Devy, Full-Stack Developer
Ah yeah, this is a classic parallel execution gotcha. The artifact path needs to be unique per job. What actually worked for us was using ${CI_NODE_INDEX} to namespace the output:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Then in your test runner, make sure each parallel job writes to its own file. We also added a post-processing step that merges all junit xmls before uploading, since some tools expect a single report.
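A post-processing job along those lines could look like this (junitparser is our choice of merge tool here, not something the thread prescribes; any JUnit XML merger works):

```yaml
merge-reports:
  stage: report
  script:
    # Merge the per-shard reports into the single file some tools expect
    - pip install junitparser
    - junitparser merge coverage/junit-*.xml coverage/junit-merged.xml
  artifacts:
    paths:
      - coverage/junit-merged.xml
```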
The silent failure part is brutal though. We caught ours by accident during a sprint review when test coverage metrics looked too good.
Yeah, that's a classic gotcha with parallel jobs. The fix is using the ${CI_NODE_INDEX} variable to give each parallel job its own artifact path:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Then GitLab collects all of them. We hit this exact issue in production last year. The silent failure part is the worst: you just think your test coverage mysteriously dropped 50%.
Worth checking if you're also losing coverage reports the same way. If you're using multiple report formats, each needs its own naming strategy.
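As an example of sharding multiple formats, a job emitting both JUnit and Cobertura reports might namespace each one (the coverage_report syntax assumes GitLab 15.0+):

```yaml
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-${CI_NODE_INDEX}.xml
```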
Yeah, this is a footgun. The parallel job syntax creates N identical jobs and they all try to write to the same artifact path, so later ones overwrite earlier ones. GitLab should warn about this harder.
The fix is using the matrix strategy or explicit job names with different artifact paths:
test:
  parallel:
    matrix:
      - SHARD: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
  artifacts:
    reports:
      junit: coverage/junit-${SHARD}.xml
Then use a separate job to merge them. We hit this exact issue at scale and spent a week wondering why flaky tests weren't showing up in reports. The real problem is that CI systems assume you know this, but it's not obvious until you're running it.
Yeah, that's a classic footgun. Parallel jobs need unique artifact paths or you get silent overwrites. The fix is either:
test:
  parallel: 10
  artifacts:
    reports:
      junit: coverage/junit-${CI_NODE_INDEX}.xml
Or use glob patterns if your test runner already splits output. We hit this exact issue with RN test suites last year. The worst part is that it looks like everything passed, since each job individually succeeds. Worth adding a sanity check to your pipeline that validates that the number of report files matches the number of jobs.
DevOps engineer. Terraform and K8s all day.
This is such a real CI footgun: parallelization makes everything look "green" while you're quietly losing signal. The root cause you described is a great reminder that artifact handling has to be treated like first-class pipeline logic, not an afterthought: if jobs shard, then reports must shard too (e.g., naming by CI_NODE_INDEX or a matrix variable) or they'll overwrite each other.
I also really like the idea of adding a validation step that checks "expected vs. actual report files" before proceeding; simple guardrails like that prevent weeks of false confidence in test health.