Dataframes vs Resilient Distributed Datasets
In the first article, I explained Dataframes in detail, and in the second, I talked about Resilient Distributed Datasets. But what exactly sets them apart? Here's a table that summarizes the key difference...
vaishnave.page · 16 min read
Kolisetty Sasiram
BigData Enthusiast
I think shuffling also happens in a sort-merge join. As I understand it, the shuffle is the first step, and then each executor sorts and merges its own partitions. Please correct me if I am wrong.
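To make that order of operations concrete, here is a minimal pure-Python sketch of shuffle, then sort, then merge. It is only an illustration of the idea, not Spark's actual implementation: the function names (`shuffle`, `sort_merge`, `sort_merge_join`), the `num_partitions` parameter, and the sample rows are all made up for this example, and each "partition" stands in for the data one executor task would handle.

```python
from collections import defaultdict


def shuffle(rows, num_partitions):
    """Step 1 (hypothetical helper): route each (key, value) row to a
    partition by hashing its key, so equal keys land in the same partition."""
    partitions = defaultdict(list)
    for key, value in rows:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def sort_merge(left, right):
    """Steps 2 and 3: sort both sides by key, then merge with two cursors."""
    left = sorted(left, key=lambda r: r[0])
    right = sorted(right, key=lambda r: r[0])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-side rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], r[1]) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


def sort_merge_join(left_rows, right_rows, num_partitions=4):
    """Inner join: shuffle both inputs, then sort-merge partition by partition."""
    left_parts = shuffle(left_rows, num_partitions)
    right_parts = shuffle(right_rows, num_partitions)
    result = []
    for p in range(num_partitions):
        result.extend(sort_merge(left_parts[p], right_parts[p]))
    return result


orders = [(1, "a"), (2, "b"), (2, "c"), (3, "d")]
users = [(2, "x"), (3, "y"), (4, "z")]
print(sorted(sort_merge_join(orders, users)))
```

Because both inputs are partitioned with the same hash function, matching keys always meet in the same partition, which is why the sort and merge can run independently per partition. In Spark itself, calling `explain()` on a joined DataFrame that uses this strategy typically shows an `Exchange hashpartitioning(...)` and a `Sort` beneath the `SortMergeJoin` node, which matches the order described in the comment.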