John Ryan · articles.analytics.today · Dec 10, 2024
Snowflake Streams and Tasks: Best Practices
As a Data Engineer, it’s vital to understand the techniques available for Change Data Capture (CDC). Using these methods, you can quickly identify whether incoming data has changed and whether you need to process the new/modified data. Snowflake Stre...
384 reads · snowflake
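The card above is about Change Data Capture: detecting which incoming rows are new or modified so only those need processing. As a minimal, Snowflake-agnostic sketch of that idea (the table snapshots and the `id` key below are made up for illustration, not taken from the article):

```python
def diff_snapshots(old_rows, new_rows, key="id"):
    """Compare two table snapshots and classify each incoming row as
    inserted, updated, or unchanged -- the core idea behind CDC."""
    old = {r[key]: r for r in old_rows}
    changes = {"insert": [], "update": [], "unchanged": []}
    for row in new_rows:
        prev = old.get(row[key])
        if prev is None:
            changes["insert"].append(row)      # key never seen before
        elif prev != row:
            changes["update"].append(row)      # key exists, contents differ
        else:
            changes["unchanged"].append(row)   # identical row, skip processing
    return changes

# Hypothetical before/after snapshots of a customers table
before = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
after = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Rob"}, {"id": 3, "name": "Eve"}]

changes = diff_snapshots(before, after)
```

A Snowflake Stream does this bookkeeping for you server-side; this sketch only shows what "has the data changed?" means at the row level.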
Gabriela Caldas · byte-sizedjourneys.hashnode.dev · Nov 21, 2024
Designing Data Pipelines for Success: Best Practices for Scalability and Data Quality
In today’s world, businesses need accurate and readily available insights to stay competitive. Data engineering plays a crucial role in creating the infrastructure that makes this possible. From building efficient data pipelines to ensuring data qual...
data-engineering
Harvey Ducay · hddatascience.tech · Nov 13, 2024
What is ETL in Data Engineering?
So imagine you're making a smoothie - that's basically what ETL is in the data world. First, you Extract all your ingredients (data) from different places, like grabbing berries from the fridge, bananas from the counter, and yogurt from the store - j...
etl-pipeline
Arpit Tyagi · dataminds.hashnode.dev · Oct 24, 2024
How to Transfer All SQL Database Tables to Azure Data Lake in One Go?
Step 1: Create an instance of Azure Data Factory. Step 2: Set up Linked Services. Step 3: Create a dataset for both (source: SQL Database; destination: Azure Data Lake). Step 4: Build the Data Pipeline with the help of datasets. First I will us...
10 likes · Azure Data Factory · data-engineering
Arpit Tyagi · dataminds.hashnode.dev · Oct 23, 2024
How to Copy SQL Database Tables Post-Join in Azure Data Lake via ADF!!
Step 1: Create an instance of Azure Data Factory. Step 2: Set up Linked Services. Step 3: Create a dataset for both (source: SQL Database; destination: Azure Data Lake). Step 4: Build the Data Pipeline with the help of datasets. Step 5: Test ...
10 likes · Azure Data Factory · Azure
Arpit Tyagi · dataminds.hashnode.dev · Oct 23, 2024
Can we copy SQL Table to Pipe-Separated Files? - Let us see!!
Step 1: Create an instance of Azure Data Factory. Step 2: Set up Linked Services. Step 3: Create a dataset for both (source: SQL Database; destination: Azure Data Lake). This is the most important step because we will select the pipe as a delim...
10 likes · Azure Data Factory · Azure
Ekemini Thompson · ekeminithompson.hashnode.dev · Sep 7, 2024
Data Engineering Task 2: Understanding ETL Pipelines
For your next task as an aspiring data engineer, your challenge is to write an article titled "Understanding ETL Pipelines: Extract, Transform, Load in Data Engineering." This article should introduce readers to the ETL process, explaining its signif...
etl-pipeline
Shreyash Bante · shreyash27.hashnode.dev · Sep 4, 2024
ETL Process: A Beginner’s Guide 2
Transform ⭐ The transform phase in the ETL (Extract, Transform, Load) process is where raw data is refined, organized, and prepared for analysis. This step is crucial because the data extracted from various sources often comes in different formats, w...
Data Science
Shreyash Bante · shreyash27.hashnode.dev · Aug 27, 2024
ETL Process: A Beginner’s Guide 🚶♂️➡️
What is ETL? ETL stands for Extract, Transform, Load. It is a core process in data engineering used to integrate data from multiple sources, transform it into a usable format, and load it into a target system, such as a data warehouse or data lake. ...
etl-pipeline
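Several of the cards above describe the same Extract, Transform, Load pattern. A minimal end-to-end sketch of the three phases in Python (the CSV payload, the `sales` table, and the use of in-memory SQLite as the "warehouse" are illustrative assumptions, not details from any of the articles):

```python
import csv
import io
import sqlite3

def extract(csv_text):
    """Extract: pull raw rows out of a source (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: clean and normalize -- trim whitespace, fix casing,
    and cast the amount column from text to a number."""
    return [{"name": r["name"].strip().title(), "amount": float(r["amount"])}
            for r in rows]

def load(rows, conn):
    """Load: write the cleaned rows into the target system
    (an in-memory SQLite table standing in for a warehouse)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (:name, :amount)", rows)
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]

raw = "name,amount\n alice ,10.5\nBOB,3\n"
conn = sqlite3.connect(":memory:")
loaded = load(transform(extract(raw)), conn)
```

Real pipelines swap each phase for a production component (API/database extractors, a transformation framework, a warehouse loader), but the three-stage shape stays the same.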
Kumar Rohit · krohit-de.hashnode.dev · Aug 15, 2024
Hello Spark on Minikube
Minikube is a beginner-friendly tool that lets you run a Kubernetes cluster on your local machine, making it easy to start learning and experimenting with Kubernetes without needing a complex setup. It creates a single-node cluster inside a virtual m...
46 reads · Experiments on Minikube 🚀 · sparksql