The distinction between parallelism and trust is underappreciated. Most teams celebrate running more agents without asking how they'll validate the output.
The context pollution point is real. I've seen the same pattern - one bad agent output pollutes the shared context and the entire run drifts. A verifier gate after each parallel branch changes the math completely.
Curious what thresholds you use for the verifier. Pass/fail only or do you score confidence and set a cutoff?