Ali Vijdaan · vijdaancoding.hashnode.dev · Jun 22, 2024
Never Struggle with Skewed Data Again
The article will give a thorough explanation of what skewed data is, why it's bad for your models, and how you can identify and fix such data. What is Skewed Data? Data which is normally distributed would look like this. As you can see, the mean, median a...
Tag: skewed datasets
Sanika Nandpure · sanikanandpure.hashnode.dev · May 19, 2024
the problem with skewed datasets
A skewed dataset is a dataset where the outputs are not evenly split. In other words, if our dataset has 10 training examples, 8 of them have an expected output y-hat of 1 whereas only 2 of them have an expected output of 0. In this case, the error m...
Tag: skewed datasets
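The excerpt above cuts off before it says what goes wrong, so the following is only an illustrative sketch and not the article's own code. It assumes the 8-to-2 label split from the excerpt and a hypothetical "model" that always predicts the majority class, and shows how plain accuracy can look fine while the minority class is never detected.

```python
# Hypothetical labels mirroring the 8-to-2 split described in the excerpt.
labels = [1] * 8 + [0] * 2

# Assumed baseline: a "model" that ignores its input and always predicts 1.
predictions = [1] * len(labels)

# Fraction of examples where the prediction matches the label.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall on the minority class (label 0): share of true 0s that were predicted as 0.
minority_total = sum(1 for y in labels if y == 0)
minority_found = sum(1 for p, y in zip(predictions, labels) if y == 0 and p == 0)
minority_recall = minority_found / minority_total

print(f"accuracy = {accuracy:.2f}")
print(f"minority-class recall = {minority_recall:.2f}")
```

The accuracy figure alone hides the fact that no minority-class example is ever found, which is why per-class measures are often reported alongside it for skewed data.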
Ajay Veerabomma · ajayveerabomma.hashnode.dev · Feb 20, 2024
Understanding Salting Technique in Spark
Apache Spark is a powerful open-source distributed computing system known for its speed, ease of use, and sophisticated analytics capabilities. It is widely used for big data processing and analytics due to its ability to handle large-scale data acro...
2 likes · 88 reads
Tag: spark