Apache Spark Aggregation Methods
Spark has two primary methods of performing aggregations: sort-based and hash-based. Each is optimized for different scenarios and has distinct performance characteristics.
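The difference between the two strategies can be sketched in plain Python. This is a toy illustration only, not Spark's implementation: the function names `hash_aggregate` and `sort_aggregate` and the sample `rows` are hypothetical, and the example sums values per key on a small in-memory list.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def hash_aggregate(rows):
    """Sum values per key using a hash map: a single pass, no sorting."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value  # look up the key's running total and update it
    return dict(totals)


def sort_aggregate(rows):
    """Sum values per key by sorting first, then folding runs of equal keys."""
    ordered = sorted(rows, key=itemgetter(0))  # the extra sorting step
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


# Hypothetical sample data: (key, value) pairs.
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

hash_result = hash_aggregate(rows)
sort_result = sort_aggregate(rows)
```

Both functions return the same totals; the hash-based version skips the sort, while the sort-based version only ever needs to hold one group's running state at a time once the input is ordered. In Spark itself, the strategy chosen for a query appears in the physical plan printed by `explain()`, as a `HashAggregate` or `SortAggregate` operator.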
HashAggregate: Faster because it avoids sorting data
SortAggregate: Sorting data ba...
pikopira54.hashnode.dev, 5 min read