Transforms

A transform is a step in the pipeline, that is, a data processing operation. It takes one or more PCollections as input and produces another PCollection as output. When building a pipeline, we can use conditionals, loops, and so on, so the result may be a branching pipeline or a pipeline with a repeated structure.
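The branching and repeated structure described above can be sketched in plain Python. This is a toy model under stated assumptions, not the actual pipeline SDK: ordinary lists stand in for PCollections, and `apply_transform` is a hypothetical helper that takes one collection and returns a new one.

```python
from typing import Callable, Dict, List

# Toy stand-in for a PCollection: an ordinary list (assumption for illustration).
Collection = List[int]


def apply_transform(fn: Callable[[int], int], data: Collection) -> Collection:
    """Hypothetical transform step: one collection in, a new collection out."""
    return [fn(x) for x in data]


source: Collection = [1, 2, 3, 4]

# Branching: the same input feeds two independent transforms.
doubled = apply_transform(lambda x: x * 2, source)
squared = apply_transform(lambda x: x * x, source)

# Repeated structure: a loop builds one branch per scale factor.
scaled: Dict[int, Collection] = {}
for factor in (10, 100):
    # Bind factor as a default argument so each lambda keeps its own value.
    scaled[factor] = apply_transform(lambda x, f=factor: x * f, source)
```

Each call returns a new list and leaves `source` untouched, which mirrors the way a transform produces a new PCollection as output.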

First, we will look at core transforms, then composite transforms, and finally root transforms.

Core transforms represent the basic, common processing operations that we might need to perform on the data. We pass the processing logic as a function object, and this function is applied to each element of the input PCollection. The function object can run on multiple Google Compute Engine instances. ...
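As a rough illustration of a function object applied to every element, here is a minimal sketch in plain Python rather than the real SDK. The names `ToUpper` and `run_core_transform` are hypothetical, and threads stand in for separate worker machines, which is a simplifying assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


class ToUpper:
    """Hypothetical function object holding the per-element processing logic."""

    def __call__(self, element: str) -> str:
        return element.upper()


def run_core_transform(
    fn: Callable[[str], str], elements: List[str], num_workers: int = 2
) -> List[str]:
    """Apply fn to every element, spreading the work over several workers."""
    # Split the input into one bundle per worker.
    bundles = [elements[i::num_workers] for i in range(num_workers)]
    # Threads stand in for worker machines (assumption for illustration).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = pool.map(lambda bundle: [fn(e) for e in bundle], bundles)
        # Flatten the per-worker outputs into a single output collection.
        return [out for bundle in results for out in bundle]


words = ["apple", "banana", "cherry", "date"]
upper_words = run_core_transform(ToUpper(), words)
```

The same function object is handed to every worker, and each worker sees only its own bundle of elements, so the logic should not rely on element order or on state shared between workers.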
