This is a great breakdown of moving from a memory-crashing pipeline to a stable one. You mentioned identifying specific operations as the bottleneck—did you find Polars' lazy evaluation was the main factor in solving this, or was it more about the efficiency of its native implementations versus your previous method?