Gabriela Caldas for Byte-sized Journey · byte-sizedjourneys.hashnode.dev · Nov 21, 2024 · Designing Data Pipelines for Success: Best Practices for Scalability and Data Quality · In today’s world, businesses need accurate and readily available insights to stay competitive. Data engineering plays a crucial role in creating the infrastructure that makes this possible. From building efficient data pipelines to ensuring data qual... · data-engineering
Sachin Nandanwar · www.azureguru.net · Nov 4, 2024 · Implement Medallion architecture with SCD Type 2 in Microsoft Fabric - Part 1 · This article is part one of a two-part series on implementing Medallion architecture in Microsoft Fabric. In the first part, we will see how to push data from its raw format to a Bronze layer and the fundamental steps involved in this initial transfo... · 126 reads · Medallion architecture with SCD Type 2 in Microsoft Fabric · microsoftfabric
Shailesh · shaileshpashte.hashnode.dev · Sep 23, 2024 · Exploring AWS Athena, Redshift, and OpenSearch: AWS Search and Analytics Overview · Introduction: In the world of cloud computing, data is one of the most valuable assets an organization can manage. AWS offers a comprehensive suite of tools to analyze, process, and search through large datasets efficiently. Three of the key services ... · AWS
Khadeer Khan · cloudkhanquest.hashnode.dev · Sep 21, 2024 · The Data Engineer's Guide to Lakes and Warehouses: Navigating Google Cloud Solutions · As we navigate the complexities of big data, understanding the nuances between data lakes and data warehouses becomes crucial for designing scalable, efficient, and powerful data solutions. We'll examine how these architectural elements fit into the ... · 1 like · data-engineering
Shreyash Bante · shreyash27.hashnode.dev · Sep 9, 2024 · ETL Process: A Beginner’s Guide 3 · LOAD ⭐ So far we have extracted the data from the source and transformed it. How difficult can it be to just push the data to a location, right? Well, it's different from just pushing the final DataFrame to a location. How you load data depends on the req... · Data Science
Abhishek Jaiswal for Abhishek Jaiswal's team blog · dataplumbing.hashnode.dev · Aug 18, 2024 · A Comprehensive Guide to Cassandra Database · Introduction to Cassandra: Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large amounts of data across many commodity servers, with no single point of failure. It's known for its linear scalability and fault toler... · Databases
Abhishek Jaiswal for Abhishek Jaiswal's team blog · dataplumbing.hashnode.dev · Jul 20, 2024 · Snowflake for Data Engineering: · Introduction: Snowflake has quickly become a leading cloud-based data warehousing platform, offering unparalleled flexibility, scalability, and performance. It's designed to handle the complex demands of modern data engineering, making it an essential... · 10 likes · snowflake
Sandeepa Bandara Thennakoon · osga.hashnode.dev · Jun 6, 2024 · What is Hadoop? 🤔 · If you are interested in the big data analytics field, you definitely know this framework! OK, let's see what Hadoop is and how it works with big data analytics. Hadoop is an open-source framework that allows for the distributed processing of large data sets acros... · 18 likes · big data analytics
Prakhar Srivastava · prakhar1209.hashnode.dev · May 20, 2024 · DAG in Apache Spark in 10 points · 🍀 Directed Acyclic Graph (DAG): In Apache Spark, the execution plan is represented as a DAG, a directed acyclic graph. It visually outlines the sequence of stages and tasks required to compute the final result. 🍀 Logical and Physical Execution Plan... · big data
Shiv Iyer · shiviyer.hashnode.dev · May 3, 2024 · In-Depth Guide to Log-Structured Merge-Tree (LSM) Storage Systems and Their Uses · Log-Structured Merge-tree (LSM) storage systems are designed to optimize write performance, especially in environments demanding high rates of data ingestion. Here's an in-depth look at the technical aspects of LSM and some specialized use cases demo... · Data Science
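
For readers who want a feel for the write-optimized design mentioned in the LSM entry above, here is a minimal, hedged sketch in Python. It is not taken from the linked article: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "SSTable" lists are all hypothetical stand-ins chosen for illustration. It shows the general pattern only: writes land in an in-memory table, full tables are flushed as immutable sorted runs, reads check the newest data first, and compaction merges runs.

```python
import bisect


class TinyLSM:
    """Toy LSM-style store (illustrative only, everything stays in memory)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}                 # mutable, absorbs all writes
        self.memtable_limit = memtable_limit
        self.sstables = []                 # immutable sorted runs, oldest first

    def put(self, key, value):
        # Writes only touch the memtable, which keeps them cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted run of (key, value) pairs.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: memtable first, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            # (key,) sorts before any (key, value), so this finds the key's slot.
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one, letting newer runs overwrite older ones.
        merged = {}
        for run in self.sstables:
            merged.update(run)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)      # second write fills the memtable and triggers a flush
db.put("a", 3)      # newer value for "a" lives in the memtable
print(db.get("a"), db.get("b"), db.get("z"))
```

Real LSM engines add a write-ahead log, on-disk files, Bloom filters, and leveled or tiered compaction; none of that is modeled here.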