ggithiri.hashnode.dev
Setting Up AWS Glue with Docker and Examples
AWS Glue is a fully managed, serverless data integration service that makes it easy to move data between data stores. In this blog, we will show you how to set up AWS Glue using Docker and provide some examples to help you get started. Step 1: Instal...
Feb 4, 2023 · 2 min read
ggithiri.hashnode.dev
Deploying a Data Pipeline Model using Terraform on AWS
Terraform is an open-source infrastructure as code (IaC) tool that allows you to manage and provision your infrastructure resources. In this blog, we will show you how to deploy a data pipeline model using Terraform on AWS. Step 1: Install Terraform ...
Feb 2, 2023 · 2 min read
ggithiri.hashnode.dev
Deploying a Data Pipeline on AWS with CloudFormation
AWS CloudFormation is a service that helps you model and set up your Amazon Web Services resources so you can spend less time managing those resources and more time focusing on your applications that run in AWS. In this blog, we will show you how to ...
Feb 1, 2023 · 2 min read
ggithiri.hashnode.dev
Spark Streaming
Spark Streaming: Processing Big Data in Real-Time. Big data processing has become an essential aspect of modern data management and analysis. With the growth of connected devices and the Internet of Things (IoT), organizations are faced with the chall...
Jan 31, 2023 · 2 min read
ggithiri.hashnode.dev
Airflow
Airflow: A Powerful Tool for Data Engineering. Data engineering is an essential aspect of modern data management and analysis. It involves collecting, cleaning, transforming, and storing data in a way that enables organizations to make informed decisi...
Jan 30, 2023 · 2 min read