@SchemaSensei

Rahul Das

Data Engineering Wizard

Joined October 2023

Rahul Das's blogs

CodeCraft Chronicles · schemasensei.hashnode.dev · 6 posts

Recently published

Rahul Das · schemasensei.hashnode.dev · Jan 7, 2025 · 9 min read

Unlocking Real-Time Data with Change Data Capture (CDC)

In this guide, we will cover CDC, its importance, and the setup of a CDC stack using Kafka, Debezium, and other services. Additionally, we will configure a PostgreSQL connector using the Confluent Control Center web UI to capture changes from a Postg...

Rahul Das · schemasensei.hashnode.dev · Sep 25, 2024 · 4 min read

Streaming data from Kafka to BigQuery using Apache Beam

In this guide, we will walk through the process of reading data from Kafka and storing it in BigQuery using Apache Beam. Apache Beam is a unified programming model for defining both batch and streaming data-parallel processing pipelines. Prerequisite...

Rahul Das · schemasensei.hashnode.dev · Sep 14, 2024 · 5 min read

Setting Up a Kafka Cluster on GCP VM Using Docker

In the world of real-time data processing and streaming, Apache Kafka stands as a robust and widely used platform. Setting up a Kafka cluster on a Google Cloud Platform (GCP) VM can be a crucial step in building your data processing pipeline. In this g...

Rahul Das · schemasensei.hashnode.dev · Aug 31, 2024 · 4 min read

Getting Started with PySpark

Apache Spark is a powerful distributed computing framework commonly used for big data processing, ETL (Extract, Transform, Load), and building machine learning pipelines. It supports various programming languages, including Scala, Java, and Python, m...

Rahul Das · schemasensei.hashnode.dev · Dec 4, 2023 · 5 min read

Setting up a Multi-Node Hadoop Cluster on Google Cloud

In this tutorial, we will walk through the process of setting up a multi-node Hadoop cluster on Google Cloud. This cluster will consist of one master node and two worker nodes. We will use Google Cloud VM instances for this setup; this tutorial ...
