Alex Merced · alexmerced.hashnode.dev · Oct 7, 2024
Exploring Data Operations with PySpark, Pandas, DuckDB, Polars, and DataFusion in a Python Notebook
Apache Iceberg Crash Course: What is a Data Lakehouse and a Table Format? Free Copy of Apache Iceberg the Definitive Guide Free Apache Iceberg Crash Course Iceberg Lakehouse Engineering Video Playlist Data engineers and scientists often work wit...
Python
Filippo Mameli · mameli.hashnode.dev · Oct 6, 2024 · Featured
Riding the Rust Wave: A Journey from Pandas to Polars in Data Analysis
There is an interesting trend in the data community involving the rebuilding of popular products using Rust. Polars is one example of this, and it has really caught my interest. YADL (Yet Another Dataframe Library)? I've seen many articles claiming t...
21 likes · 212 reads · Rust
Anix Lynch · Pro · anixblog.hashnode.dev · Oct 3, 2024
20 Polars concepts with Before-and-After Examples
1. Creating DataFrames (pl.DataFrame) 🏗️ Boilerplate Code: import polars as pl Use Case: Create a DataFrame to hold your data, similar to pandas. 🏗️ Goal: Store and manipulate data in a high-performance DataFrame structure. 🎯 Sample Code: # Creat...
Matplotlib
Jason Shiers · randomforest.hashnode.dev · Sep 7, 2024
NS&I Premium Bonds: Insights from a Monte Carlo Experiment
In Part 1 of this series, we explored the characteristics of NS&I Premium Bonds and created a function to simulate the results of a single monthly prize draw for a given bond holding. In Part 2, we built a Monte Carlo experiment, simulated 6 million ...
Monte Carlo Simulation of NS&I Premium Bonds · monte carlo
Sandeep Pawar · Pro · fabric.guru · Aug 4, 2024
Quick Test: Daft With Ray In Fabric
💡 This is not a benchmark. This is one specific instance of a test, on a specific dataset using specific transformations. My goal was just to see how to set up daft + ray in Fabric and compare for the sake of learning. Daft is a distributed query ...
436 reads · quicktest
Noufal Salim · blog.noufals.in · Jun 9, 2024
Unlocking the Potential: Evaluating the Best Data Processing Frameworks for Your Needs - A Comparative Study of Pandas, Dask, and Polars
Abstract As the field of data processing and analysis continues to advance, it is becoming increasingly crucial to select the appropriate tools for the task at hand. The purpose of this white paper is to present an extensive comparison of three well-...
Polars
ayoub arahmat · ayoubar.hashnode.dev · Jan 7, 2024
How do I convert a Django QuerySet into a Polars Dataframe?
The Problem: In my software engineering journey, I faced a problem while working on a Django project when I had to grab a bunch of data from a PostgreSQL server and mess around with it to do some data analysis. The usual way to do this is by running ...
114 reads · Django
Sreekesh Iyer · blog.sreekeshiyer.com · Nov 12, 2023
Getting started with Polars - A Pandas Alternative?
The first thing that you tend to learn when you start with Data Science in Python is the pandas module, which covers literally any kind of data manipulation, transformation and processing for tabular datasets. As a module, it has developed itself ins...
11 likes · 43 reads · Polars
Ben Hammond · blog.benhammond.tech · Sep 14, 2023
Shaving hours off a Pandas script
Crunching a few hundred million lines of data At the Health Equity Tracker, the largest dataset we work with is a case-level set provided by the CDC with every COVID-19 infection in the United States, along with additional information including race,...
107 reads · DataVisualization
DataChef for DataChef's Blog · blog.datachef.co · Aug 15, 2023
Choosing the Best Data Manipulation Package in Python: A Comprehensive Comparison
Introduction Pandas is one of the most widely used data manipulation libraries in Python, known for its ease of use and powerful functionality. However, as the data size grows, Pandas can become slow and memory-intensive. In this blog post, we will c...
187 reads · Koalas