© 2026 Hashnode
In this article, I’ll give you a beginner-friendly introduction to the Polars library in Python. Polars is an open-source library, originally written in Rust, which makes data wrangling easier in Python. The syntax of Polars is very similar to Pandas...

Streaming data processing has come a long way, so why stick to old methods instead of adopting modern practices? Let me share a fresh perspective that can help you solve your problem. Inspiration from Batch Processing Batch Processing shines with below (tho...

Every data team knows the drill: a PM needs to “just take a quick look” at some Parquet data. That usually means asking an engineer to write SQL or spin up a tool to pull a few rows. It’s a small ask, but one that happens often enough to slow everyon...

Ever been bogged down by data pipelines crashing due to memory issues? It's a frustratingly common problem in data engineering projects. This post chronicles my experience of identifying and resolving memory bottlenecks in our data processing using t...

Apache Iceberg Crash Course: What is a Data Lakehouse and a Table Format? Free Copy of Apache Iceberg: The Definitive Guide Free Apache Iceberg Crash Course Iceberg Lakehouse Engineering Video Playlist Data engineers and scientists often work wit...
