I think shuffling also happens in a sort-merge join. The shuffle is the first step, and then the sort and merge operations are performed within each executor. Please correct me if I am wrong.
Hi Kolisetty Sasiram
Shuffling does happen in a sort-merge join as well. I left it out because Spark only performs the shuffle, which redistributes the data across the cluster by the join key, when the data is not already partitioned by that key. But since I'm listing out the process, it makes more sense to add it. Thank you for your input!
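
To make the order of the steps concrete, here is a minimal pure-Python sketch of the idea: shuffle by join key, then sort and merge inside each partition. It is only an illustration under simplifying assumptions, not Spark's actual implementation: rows are `(key, value)` tuples, the join is an inner join, and the names `shuffle`, `merge_sorted`, and `sort_merge_join` are hypothetical helpers made up for this example.

```python
from collections import defaultdict


def shuffle(rows, num_partitions):
    """Hash-partition (key, value) rows so equal keys land in the same partition."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def merge_sorted(left, right):
    """Inner-join two lists of (key, value) rows that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows on the right side that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that right-side run.
            while i < len(left) and left[i][0] == lk:
                for _, right_value in right[j:j_end]:
                    out.append((lk, left[i][1], right_value))
                i += 1
            j = j_end
    return out


def sort_merge_join(left_rows, right_rows, num_partitions=4):
    # Step 1: shuffle both sides by the join key (skipped in Spark when the
    # data is already partitioned by that key).
    left_parts = shuffle(left_rows, num_partitions)
    right_parts = shuffle(right_rows, num_partitions)

    result = []
    for p in range(num_partitions):
        # Step 2: sort each partition by the join key.
        left_sorted = sorted(left_parts.get(p, []), key=lambda row: row[0])
        right_sorted = sorted(right_parts.get(p, []), key=lambda row: row[0])
        # Step 3: merge the two sorted partitions.
        result.extend(merge_sorted(left_sorted, right_sorted))
    return result
```

In this sketch each loop iteration over `p` stands in for the work one executor task would do on its own partition. The order of rows in the result depends on how keys hash into partitions, so compare results as sorted lists or sets.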