Now that we know what a Cascading record looks like, how do we process these records? How do we move and manipulate data? Cascading provides the concept of pipes, which control how data is handled during processing.
A pipe is a processing step applied to a stream of tuples. The Cascading API lets the developer connect pipes into pipe assemblies that split, merge, group, or join streams. As data moves through the pipes, streams may be separated or combined for various purposes.
Some pipes, such as GroupBy and the Join classes, perform single actions on entire Tuple streams. ...
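To make the split, merge, and group operations concrete, here is a minimal sketch in plain Java. It deliberately does not use the Cascading API: the class `PipeSketch`, the `WordCount` type, and the methods `split`, `merge`, and `groupAndSum` are hypothetical names invented for this illustration, and in-memory lists stand in for tuple streams. It only models what it means for an operation to act on a whole stream at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PipeSketch {

    // Hypothetical stand-in for a two-field tuple (word, count).
    // This is NOT Cascading's Tuple class.
    public static final class WordCount {
        public final String word;
        public final int count;

        public WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    // Split: keep only the tuples whose count reaches the threshold.
    // Calling this twice with different thresholds branches one stream in two.
    public static List<WordCount> split(List<WordCount> stream, int minCount) {
        List<WordCount> out = new ArrayList<>();
        for (WordCount t : stream) {
            if (t.count >= minCount) {
                out.add(t);
            }
        }
        return out;
    }

    // Merge: combine two streams that share the same fields into one stream.
    public static List<WordCount> merge(List<WordCount> left, List<WordCount> right) {
        List<WordCount> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    // Group: gather all tuples with the same key and sum their counts.
    // The whole stream must be seen before any group is complete.
    public static Map<String, Integer> groupAndSum(List<WordCount> stream) {
        Map<String, Integer> totals = new TreeMap<>();
        for (WordCount t : stream) {
            totals.merge(t.word, t.count, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<WordCount> left = List.of(new WordCount("pipe", 1), new WordCount("tuple", 2));
        List<WordCount> right = List.of(new WordCount("pipe", 4));

        List<WordCount> merged = merge(left, right);
        System.out.println(groupAndSum(merged));
        System.out.println(split(merged, 2).size());
    }
}
```

Note that `groupAndSum` cannot emit a total for any key until it has consumed every tuple, which is the sense in which a grouping step acts on the entire stream rather than on one tuple at a time.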