Chapter 6. Batch Is a Special Case of Streaming

So far in this book, we have been talking about unbounded stream processing, that is, processing data continuously from some starting time onward, with no end. This situation is depicted in Figure 6-1.

Figure 6-1. Unbounded stream processing: the input does not have an end, and data processing starts from the present or some point in the past and continues indefinitely.

A different style of processing is bounded stream processing, or processing data from some starting time until some end time, as depicted in Figure 6-2. The input data might be naturally bounded (meaning that it is a data set that does not grow over time), or it can be artificially bounded for analysis purposes (meaning that we are only interested in events within some time bounds).

Figure 6-2. Bounded stream processing: the input has a beginning and an end, and data processing stops after some time.
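The idea of artificially bounding an input can be sketched in a few lines of plain Python. This is only an illustration, not Flink code: `unbounded_events` and `bounded` are hypothetical names, the events are made-up `(timestamp, value)` pairs, and the sketch assumes events arrive in timestamp order.

```python
from itertools import count, dropwhile, takewhile


def unbounded_events():
    """Hypothetical unbounded source: yields (timestamp, value) pairs forever."""
    for t in count(start=0):
        yield (t, t * 10)


def bounded(events, start, end):
    """Artificially bound a time-ordered stream to timestamps in [start, end)."""
    # Skip everything before the starting time...
    after_start = dropwhile(lambda e: e[0] < start, events)
    # ...and stop consuming input at the first event at or past the end time.
    return takewhile(lambda e: e[0] < end, after_start)


window = list(bounded(unbounded_events(), start=3, end=6))
print(window)  # [(3, 30), (4, 40), (5, 50)]
```

The source itself never ends; only the view taken through `bounded` does. A naturally bounded input, such as a fixed list of events, could be passed to the same function unchanged.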

Bounded stream processing is a special case of unbounded stream processing in which data processing happens to stop at some point. In addition, when the results of the computation are produced only once at the end, rather than continuously during execution, we have the case called batch processing (data is processed “as a batch”).
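The distinction between producing results continuously and producing them once at the end can be illustrated with a minimal Python sketch. It is not Flink code: `running_sum` and `batch_sum` are hypothetical names, and the input is a made-up bounded list of `(timestamp, value)` pairs.

```python
def running_sum(events):
    """Stream-style: emit an updated result after every input event."""
    total = 0
    for _, value in events:
        total += value
        yield total


def batch_sum(events):
    """Batch-style: run the same computation, but keep only the final result."""
    result = None  # stays None if the input is empty
    for result in running_sum(events):
        pass
    return result


events = [(0, 5), (1, 7), (2, 8)]
continuous = list(running_sum(events))
final = batch_sum(events)
print(continuous)  # [5, 12, 20]
print(final)       # 20
```

In this sketch the batch version reuses the streaming computation and simply discards the intermediate results, which mirrors the sense in which batch is a special case of streaming.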

Batch processing is a very special case of stream processing; ...
