Mudassar Khani

@malikmudassar·Joined November 2021

Data Analyst

About

Web Developer by profession. Data Engineer by passion.

Available for

I am available for consultancy on all kinds of web-related projects.

Mudassar Khani's blogs

Lets learn Data Engineering · dataisgold.hashnode.dev · 5 posts

Recently published

Mudassar Khani in dataisgold.hashnode.dev · Aug 19, 2024 · 4 min read

How does Delta Lake enhance Databricks?

Delta Lake enhances Databricks by adding powerful features and capabilities that address many common challenges in data engineering and analytics. Specifically, Delta Lake brings improvements in data reliability, performance, and management to the Da...

Mudassar Khani in dataisgold.hashnode.dev · Aug 13, 2024 · 5 min read

How does Databricks compare to Hadoop?

Databricks and Hadoop are both powerful platforms for processing and analyzing large datasets, but they have different architectures, capabilities, and approaches to handling big data. Here's a comparison between the two: 1. Architecture Databricks:...

Mudassar Khani in dataisgold.hashnode.dev · Aug 13, 2024 · 3 min read

How does Spark handle big data?

Apache Spark handles big data through a combination of distributed computing, in-memory processing, and efficient data management techniques. Here's a breakdown of how Spark manages large-scale data: 1. Distributed Computing: Cluster Management: Spa...

Mudassar Khani in dataisgold.hashnode.dev · Aug 12, 2024 · 3 min read

What is Apache Spark?

Apache Spark is an open-source distributed computing system designed for fast and efficient processing of large-scale data. It was originally developed at UC Berkeley's AMP Lab and later became one of the most widely used data processing frameworks i...

Mudassar Khani in dataisgold.hashnode.dev · Jan 5, 2024 · 4 min read

ETL Basics with Python

Extract, Transform, Load (ETL) is a crucial process in the realm of data engineering, allowing organizations to efficiently collect, process, and integrate data from various sources into a unified, valuable format. ETL involves extracting data from s...