nkem.hashnode.dev
Data Load Tool (dlt) Series: Creating Pipelines from Different Sources Part 2
In the first article of this series, I created a pipeline, loaded toy data into DuckDB, and viewed the loaded info. I used the dlt.pipeline and pipeline.run() methods, and I used DuckDB, sql_client and the dlt dataset to view tables and query data. In this articl...
Jan 25, 2025 · 7 min read

nkem.hashnode.dev
Data Load Tool (dlt) Series: Getting Started with Pipelines Part 1
Introduction: Data Load Tool (dlt) is an open-source library designed to simplify the process of extracting, loading and transforming (ELT) data from various, often messy data sources into well-structured live datasets. dlt loads data from a wide rang...
Dec 30, 2024 · 5 min read

nkem.hashnode.dev
Understanding File Formats in Data Engineering
Introduction: In the world of data engineering, the choice of file format is a crucial decision that can significantly impact the efficiency and effectiveness of your data pipeline. With popular options like CSV, JSON, Avro, Parquet, and ORC offering ...
Jun 21, 2024 · 6 min read

nkem.hashnode.dev
Streamlining Database Management: Deploying PostgreSQL with Docker and Python
Introduction: Docker is an open-source platform that enables developers to automate the deployment, scaling, and management of applications using containerisation. It provides the capability to create, deploy, and manage containers, ensuring that appl...
Jun 19, 2024 · 5 min read

nkem.hashnode.dev
Leveraging SQL Window Functions for Advanced Data Manipulation
Introduction: Window functions are a special type of SQL function that perform calculations across a set of rows within a query, known as a window. This is comparable to calculations done using aggregate functions such as SUM, AVG, MIN, MAX a...
Feb 7, 2024 · 8 min read