Spark Performance Optimization - Data Skew
What is data skew in Spark?
Data skewness in Apache Spark refers to a condition where the data being processed is not distributed evenly across partitions. In an ideal scenario, data should be uniformly distributed across all the partitions to ensur...
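To make the idea of uneven partitions concrete, here is a minimal sketch in plain Python (not Spark itself) that mimics hash partitioning and compares partition sizes for evenly spread keys against a dataset dominated by one "hot" key. The function names, the sample keys, and the use of `zlib.crc32` as the hash are all assumptions made for illustration; Spark uses its own partitioners internally.

```python
from collections import Counter
import zlib


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition under simple hash partitioning."""
    sizes = Counter()
    for key in keys:
        # crc32 is a deterministic stand-in hash (assumption); Spark uses its own hash functions.
        sizes[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return [sizes[p] for p in range(num_partitions)]


def skew_ratio(sizes):
    """Largest partition divided by the mean partition size; 1.0 means perfectly even."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# Hypothetical data: 1,000 distinct keys versus one hot key holding 900 of 1,000 records.
uniform_keys = [f"user_{i}" for i in range(1000)]
skewed_keys = ["user_0"] * 900 + [f"user_{i}" for i in range(1, 101)]

uniform_sizes = partition_sizes(uniform_keys, 4)
skewed_sizes = partition_sizes(skewed_keys, 4)

print("uniform:", uniform_sizes, "ratio:", round(skew_ratio(uniform_sizes), 2))
print("skewed: ", skewed_sizes, "ratio:", round(skew_ratio(skewed_sizes), 2))
```

In the skewed case, every record with the hot key hashes to the same partition, so that partition holds at least 900 of the 1,000 records while the others share the rest. In a real job, the task processing that partition would run far longer than its peers.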
mpmartydata.hashnode.dev · 7 min read