Kyle Shelton · chaoskyle.com · Aug 13, 2023 · Data Engineering for DevOps Engineers · Introduction: Have you ever gone camping? If you have, then you know that it's important to have a plan. You need to know where you're going, what you're going to do, and what supplies you need. Data engineering is a lot like camping. You need to have... · Tag: Data Science
Victor Chaba · chaba.hashnode.dev · Aug 4, 2023 · A Simple ETL Pipeline Automation Using Airflow on AWS · Airflow is an open-source platform for creating, scheduling, and monitoring workflows. In this tutorial, we'll use Airflow to extract weather data from an API, transform the data, and load it into a CSV file in an S3 bucket. We'll start by creating a... · Tag: airflow
Flávio Regis Arruda · xboard.hashnode.dev · Jul 27, 2023 · Tip: set depends_on_past=True in Airflow when creating a forecast model pipeline · 💡 When creating a forecasting model pipeline in Airflow, set depends_on_past=True. Why? Forecasting models help us predict future data based on patterns and trends observed in historical data. These models' output inherently depends on past observations... · Tag: airflow
Tanupriya Singh · tanupriya.com · Jul 14, 2023 · My Journey as a Machine Learning Engineer · Introduction: One of the questions I get as a Machine Learning Engineer is whether I am required to read research papers and be aware of the latest algorithms and models. The answer is no. Don’t get me wrong, it benefits from knowing the various model... · 2 likes · 521 reads · Tag: Machine Learning
Rohan Anand · rohan-anand.hashnode.dev · Jul 13, 2023 · Managing Python Version Dependency in Google Cloud Composer · Introduction: Google Cloud Composer is a managed Apache Airflow service that allows users to orchestrate and manage workflows in the cloud. However, one common challenge users face is needing a Python package that requires a higher v... · 11 likes · 38 reads · Tag: GCP
Rohan Anand · rohan-anand.hashnode.dev · Jun 25, 2023 · Difference Between Reschedule Mode and Deferrable Flag in Airflow Sensors · Motivation: Recently, my PR adding the difference between Deferrable and Non-Deferrable Operators was merged into apache-airflow; you can see it here - PR Link. So I thought I would explain it in a blog post. Introduction: Airflow sensors are a special kin... · 38 reads · Tag: airflow
z68 · z68mldev.hashnode.dev · May 26, 2023 · A Simple Guide to Airflow start_date and execution_date · In Airflow, the start_date and execution_date can be very counterintuitive for those who are not familiar with them. TL;DR: In this article, I will explain why the execution_date in Airflow is different from what we might expect. The Usage of start_d... · 33 reads · Tag: Python
Prabodh Agarwal for CMD-LYNE · toplyne.hashnode.dev · May 17, 2023 · Cooking with Snowflake · The Snowflake community is rife with information dumps on how to optimize expensive queries. We know because we combed through a ton of them. What we present here are three tactical ways in which we’ve done this at Toplyne. Introduction: Toplyne’s bus... · 7 likes · 336 reads · Tag: snowflake
Arun R Nair · arunrnair.hashnode.dev · May 7, 2023 · Install Airflow in Windows using Docker in 8 steps. · Apache Airflow is an open-source workflow management platform designed for data engineering pipelines. It was developed by Maxime Beauchemin at Airbnb in October 2014. Airbnb created Airflow to manage increasingly complex workflows and to programmati... · 379 reads · Tag: airflow
Sibbir Ahmmed Sihan · sihan.hashnode.dev · Apr 30, 2023 · Data Orchestration: Dagster with Google Drive Api · I have tried to recreate the crash course in my own way here. If you read through the documents, you will understand how to use dagster in June 2023. First of all, we need to create a virtual environment. It is optional, but it is highly recommended.... · 1.2K reads · Tag: Pipeline