Malavika · viksmals.hashnode.dev · Feb 11, 2024
Running Vulnerability Scans for Spark Third-Party Packages
If you use Spark in your codebase, chances are you also use some popular third-party packages to work with Spark. What does this mean from a security perspective? Your application may have some security vulnerabilities introduced due to these third-p...
Series: Spark For Data Science
Wanjiru Njuguna · wanjiruh.hashnode.dev · Jun 27, 2023
Introduction to Spark with Scala
Imagine it's your first day on a new project in Spark. The project manager looks at your team and says, "In this project I want you to use Scala." So let's first understand the context of Spark and data processing in general, and also know what S...
2 likes · 35 reads · #apache-spark
Stephen Oladele, for AI Community Africa ("AI School Africa") · fearless-goat-measure-54.hashnode.dev · Jun 25, 2023
Spark Machine Learning Pipelines: A Comprehensive Guide - Part 1
Machine learning pipelines contain a sequence of independent and separate steps that define a machine learning workflow for solving a specific problem. The goals of a machine learning pipeline are: Improve the quality of models developed and deploye...
60 reads · Series: Building Real-Time Machine Learning Pipelines with Apache Spark and MLeap · Machine Learning
Sivaraman Arumugam · sivayuvi79.hashnode.dev · Dec 25, 2022
Apache Spark - Tutorial 3
Here we are going to learn Spark memory management. Before starting, we need to understand the following points clearly: one core will process one partition of data at a time; a Spark partition is equivalent to an HDFS block, and repartitioning is possible; one t...
117 reads · Series: Apache Spark · spark
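The excerpt above relates cores, partitions, and HDFS blocks. A minimal sketch of that arithmetic, assuming the common 128 MB HDFS default block size (the function names and values here are illustrative, not Spark's actual planner):

```python
import math

HDFS_BLOCK_MB = 128  # common HDFS default block size (an assumption, configurable per cluster)

def estimate_partitions(input_size_mb: int, block_mb: int = HDFS_BLOCK_MB) -> int:
    """One initial partition per HDFS block of input."""
    return math.ceil(input_size_mb / block_mb)

def task_waves(num_partitions: int, total_cores: int) -> int:
    """Each core processes one partition at a time, so the partitions
    run in ceil(partitions / cores) sequential waves of tasks."""
    return math.ceil(num_partitions / total_cores)

partitions = estimate_partitions(1024)  # 1 GB of input -> 8 initial partitions
waves = task_waves(partitions, 4)       # 4 cores -> 2 waves of tasks
print(partitions, waves)
```

This is why repartitioning matters: with far more partitions than cores you pay scheduling overhead across many waves, and with fewer partitions than cores some cores sit idle.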
Sivaraman Arumugam · sivayuvi79.hashnode.dev · Dec 22, 2022
Apache Spark - Tutorial 2
Spark Submit. From now on we are going to use spark-submit frequently, so let's first learn the syntax for spark-submit. Once the Spark application build is completed, we execute that application via the spark-submit command. spar...
93 reads · Series: Apache Spark · Spark For Data Science
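The excerpt above introduces the spark-submit syntax. As a command template (not runnable without a cluster; the class name, resource values, and jar name are illustrative assumptions), a typical invocation looks like:

```shell
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MainApp \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my-spark-app.jar arg1 arg2
```

The `--master` and `--deploy-mode` flags pick where the driver and executors run, while the executor flags size the resources each task wave gets.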
Sivaraman Arumugam · sivayuvi79.hashnode.dev · Dec 16, 2022
Apache Spark - Tutorial 1
First we need to understand: why Spark? There are some drawbacks in Hadoop MapReduce processing, and that is where Spark came into the game. The most commonly listed drawbacks of MapReduce are: it's only made for batch processing; CDC - change data ...
276 reads · Series: Apache Spark · #apache-spark