Shahin for DataChef's Blog · blog.datachef.co · Dec 20, 2023
Mysterious Spark Checkpoints Behaviour
It all started with a change in the checkpoint path of our Spark applications. We use Spark Structured Streaming and AWS S3 buckets to maintain checkpoints. Let’s say we were using s3://bucket/spark/topic/ as the checkpoint path, and we changed it to...
583 reads · Structured Streaming

Shahin · shahin.blog · Oct 12, 2023
Composing Functions for PySpark's Structured Streaming
In PySpark's structured streaming, striving for modular, reusable code while ensuring adaptability for each unique use case is a common challenge. I recently had the task of leveraging the broad capabilities of the writeStream method without losing t...
70 reads · spark