Spark RDD Cheat Sheet with Scala
A cheat sheet on Spark RDD operations with Scala
The main abstraction Spark provides is a resilient distributed dataset (RDD), which is a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel. RDDs a...
mouhammad.hashnode.dev · 6 min read
Owen Magumise
dataengineer
Great stuff