Batch processing versus real-time data processing

In batch processing, data is collected over a fixed interval and each collected batch is then sent for processing. The batch interval can be anything from one day to one minute. In today's data analytics and business intelligence world, a batch rarely spans more than one day; otherwise, business teams would have no insight into what is happening to the business on a day-to-day basis. For example, the enterprise data warehousing team may collect all the orders made during the last 24 hours and send these collected orders to the analytics engine for reporting.
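As a rough illustration of the idea, the following minimal sketch groups timestamped orders into fixed-length batches. It is not taken from any particular framework: the function name `group_into_batches`, the `origin` parameter, and the sample orders are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_into_batches(records, interval, origin):
    """Group (timestamp, payload) pairs into fixed-length batch windows.

    Hypothetical helper for illustration: `interval` is the batch length
    and `origin` is the instant at which the first window starts.
    """
    batches = defaultdict(list)
    for timestamp, payload in records:
        # Dividing one timedelta by another with // yields a whole window index.
        index = (timestamp - origin) // interval
        window_start = origin + index * interval
        batches[window_start].append(payload)
    # Return the windows in chronological order.
    return dict(sorted(batches.items()))


# Made-up sample data: three orders spread over two days.
origin = datetime(2024, 1, 1)
orders = [
    (datetime(2024, 1, 1, 9, 30), "order-1"),
    (datetime(2024, 1, 1, 17, 45), "order-2"),
    (datetime(2024, 1, 2, 8, 5), "order-3"),
]

# One-day batches: the first two orders share a batch.
daily = group_into_batches(orders, timedelta(days=1), origin)

# One-minute batches: every order lands in its own batch.
by_minute = group_into_batches(orders, timedelta(minutes=1), origin)
```

With a one-day interval, the two orders placed on the first day are processed together and the third waits for the next batch. Shrinking the interval to one minute gives each order its own batch, which is the trade-off the batch interval controls: shorter intervals mean fresher results but more, smaller processing runs.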

A batch can also be as short as one minute. In the Spark framework (we will learn Spark in Chapter 7, Large-Scale Data Processing Frameworks ...

