Cyril Ukwajiunor in jaeycyril.com · Jul 11, 2025 · 6 min read
Data Manipulation with PySpark
Spark has become the de facto tool for processing large amounts of data. It is a distributed, in-memory engine with interfaces to numerous data stores, which makes it scalable, fast, and flexible. Platforms like Databricks and Snowflake (Snowpark) use...

Cyril Ukwajiunor in jaeycyril.com · Sep 9, 2023 · 4 min read
Tool Spotlight: DuckDB
This article was originally posted on LinkedIn and has only recently been moved here. DuckDB is a free and open-source, lightweight relational database management system designed for OLAP workloads characterized by more complex and longer-running que...