kassem shehady · kshehady.hashnode.dev · Nov 4, 2023
An Introduction to Google Dataflow and Apache Beam
In today's data-driven world, the ability to process and analyze vast amounts of data efficiently is crucial for businesses and organizations of all sizes. Google Dataflow and Apache Beam are two powerful tools designed to help users manage and proce...
Tag: google cloud

Nikhil Rao · nikhilrao.blog · Sep 13, 2023
Google Dataflow Optimization: Streaming Engine
What is Streaming Engine: "By default, the Dataflow pipeline runner executes the steps of your streaming pipeline entirely on worker virtual machines, consuming worker CPU, memory, and Persistent Disk storage. Dataflow's Streaming Engine moves pipelin...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 5, 2023
Apache Beam: Windowing
What are Windows? Windows are a way to group your data by event time. But why would you want to group on time? So you can apply aggregations! An example might be if you have a stream of analytic data coming from mobile phones and want to count t...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 4, 2023
Apache Beam: Values Transform
Overview: What if you only care about the values of your PCollection and not necessarily the keys? Maybe you have KV pairs for words and their counts in a string (KV<String, Integer>) and want to use the counts in another PCollection. You should use t...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 4, 2023
Apache Beam: WithKeys Transform
Overview: If you need to convert the elements of a PCollection into KV pairs using dynamically generated keys derived from each input element, you should check out the WithKeys transform.
When You Should Use the WithKeys Transform: When you only want to convert a PCol...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 4, 2023
Apache Beam: ToString Transform
Overview: If you want to convert every input element from a PCollection to a string, you should check out the ToString transforms. They can do everything from simply converting an object to a string by implicitly calling its toString() method to concate...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 4, 2023
Apache Beam: Regex
Overview: Do you have a custom DoFn that applies a regular expression (regex) pattern to the input element? Did you know there is a built-in Apache Beam transform called Regex that can simplify your code?
When You Should Use the Regex Transform: ...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 4, 2023
Apache Beam: Partition Transform
Overview: If you want to split a PCollection into multiple PCollections based on a function, use the Partition transform.
When You Should Use the Partition Transform: When you want to split the input PCollection into multiple PCollections. You might want to...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 3, 2023
Apache Beam: ParDo
Overview: If you want to apply a function which transforms each element in an input collection, you should use the ParDo transform.
When You Should Use the ParDo Transform: Any time you want to perform some generic processing function on each element o...
Tag: Apache Beam

Nikhil Rao · nikhilrao.blog · Sep 3, 2023
Apache Beam: KvSwap
Overview: What if you have a key-value pair but want to group on the values, not the keys? Should you write a custom DoFn to switch the keys and values? No! You should use the KvSwap transform!
When You Should Use the KvSwap Transform: When you want t...
Tag: Apache Beam