Memory-Efficient PyArrow Datasets
Jun 27, 2025 · 4 min read

After spending months figuring out how to handle datasets in a memory-efficient way, I decided to collect everything in one place. Hopefully, this will save someone else time.

What Is a Dataset?

If you work with Parquet files, you know each file has...