Spark Adaptive Query Execution: Dynamic Coalescing, Pruning, and Skew Handling
TLDR: Before AQE, Spark compiled your entire query into a static physical plan using size estimates that were frequently wrong — and a wrong estimate at planning time meant a skewed join, 800 small tasks, or a missed broadcast opportunity that no amo...
abstractalgorithms.dev · 34 min read