Abhishek Jaiswal for Abhishek Jaiswal's team blog · dataplumbing.hashnode.dev · Jul 20, 2024
Snowflake for Data Engineering
Introduction: Snowflake has quickly become a leading cloud-based data warehousing platform, offering unparalleled flexibility, scalability, and performance. It's designed to handle the complex demands of modern data engineering, making it an essential...
10 likes · snowflake

Abhishek Jaiswal for Abhishek Jaiswal's team blog · dataplumbing.hashnode.dev · Jul 15, 2024
Comprehensive Guide to DBT (Data Build Tool)
What is dbt? dbt (Data Build Tool) is an open-source command-line tool that helps analysts and engineers transform data in their data warehouse. It allows you to manage your data transformations with SQL in a version-controlled and collaborative envi...
10 likes · data-engineering

Abhishek Jaiswal for Abhishek Jaiswal's team blog · dataplumbing.hashnode.dev · Jul 9, 2024
The Rise of Zero ETL: Revolutionizing Data Integration
In the rapidly evolving landscape of data management, businesses are constantly seeking more efficient ways to handle and utilize data. One of the most groundbreaking advancements in this arena is the concept of Zero ETL (Extract, Transform, Load). T...
1 like · ETL

ogunniran siji · ogsiji.hashnode.dev · Jul 2, 2024
Building and Hosting a DBT Project with GitHub Actions CI/CD for Free
Introduction: In this guide, we'll set up a robust CI/CD pipeline for a dbt (data build tool) project using GitHub Actions. With GitHub Actions, we can automate testing and deployment of our dbt models, ensuring our data transformations are always up-...
1 like · 52 reads · dbt

Giang Ngo · giangblackk.hashnode.dev · Jun 29, 2024
Spark Connect - Streamline Apache Spark Data Pipeline Development in Dagster
Introduction: For many years, the conventional way to develop data pipelines in Dagster with Apache Spark has been submitting applications, which poses unique challenges to developers: Require a separate infrastructure (cluster of machines with a cluster man...
dagster

Chinmay Pandya · chinmaypandya.hashnode.dev · Jun 25, 2024
Enhancing Efficiency with Tensorflow Pipelines
Traditional Pipelines: What happens if you implement a pipeline traditionally? It takes up a lot of time and memory before we can make any observations. It is inefficient and has many limitations, mentioned below. Limitations: Performance: Traditional pi...
1 like · 43 reads · TensorFlow

Brandon Clapp · brandonclapp.com · May 25, 2024
Apache Airflow: The Key to Scheduled Data Pipelines
In the rapidly evolving landscape of data engineering, orchestrating and automating complex workflows has become a fundamental necessity. Businesses are increasingly dependent on data-driven insights, requiring robust systems to manage the seamless f...
ETL, apache-airflow

Kaushik Iska for PeerDB Blog · blog.peerdb.io · May 7, 2024
PeerDB Cloud is Now in Public Beta!
🚀 Today, we're excited to announce that PeerDB Cloud is officially entering public beta. If you're a data engineer or an organization looking for a fast, simple, and cost-effective way to replicate data from Postgres to data warehouses such as Snowf...
1 like · 153 reads · data-movement

Priti Biyani · priti-musings.hashnode.dev · May 6, 2024
The Data Lineage Advantage
In a data-heavy world such as banking platforms, there is always an origin, a data transformation with a set of rules spread across multiple services, and data stored for further use downstream. To be a true data product, it should be addressa...
103 reads · Modernizing Wealth Management System · lineage

Cloud Tuned · cloudtuned.hashnode.dev · Apr 22, 2024
Understanding Data Pipelines: Streamlining Data Flow
In today's data-driven world, businesses are inundated with vast amounts of data generated from various sources. Managing, processing, and extracting insights from this data efficiently is crucial ...
data pipeline