Scalable processing frameworks

Hardware failures can disrupt a stream processing application. To guard against this common scenario, we need a processing framework with built-in APIs for continuous computation, fault-tolerant event state management, checkpointing for recovery after failures, in-flight aggregations, windowing, and so on. Fortunately, recent Apache projects such as Storm, Spark, Flink, and Kafka support these features and more out of the box. Developers can use these APIs from languages such as Java, Python, and Scala.
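To make windowing, in-flight aggregation, and checkpointing more concrete, here is a minimal sketch in plain Python. It does not use the API of Storm, Spark, Flink, or Kafka; the names `TumblingWindowCounter` and `run_with_failure` are hypothetical and exist only for illustration. It assumes integer timestamps in seconds, string event keys, and a replayable input, so that a restored job can resume from the saved offset.

```python
import json
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per key in fixed-size, non-overlapping time windows."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # In-flight aggregation state: window start -> key -> count.
        self.counts = defaultdict(lambda: defaultdict(int))
        # Number of events processed so far (position in the input).
        self.offset = 0

    def process(self, timestamp, key):
        # Assign the event to the window that contains its timestamp.
        window_start = timestamp - (timestamp % self.window_seconds)
        self.counts[window_start][key] += 1
        self.offset += 1

    def checkpoint(self):
        """Serialize the state so processing can resume after a failure."""
        return json.dumps({
            "window_seconds": self.window_seconds,
            "offset": self.offset,
            "counts": {str(w): dict(c) for w, c in self.counts.items()},
        })

    @classmethod
    def restore(cls, snapshot):
        """Rebuild a counter from a snapshot produced by checkpoint()."""
        data = json.loads(snapshot)
        counter = cls(data["window_seconds"])
        counter.offset = data["offset"]
        for window, per_key in data["counts"].items():
            for key, count in per_key.items():
                counter.counts[int(window)][key] = count
        return counter

    def result(self):
        return {w: dict(c) for w, c in sorted(self.counts.items())}


def run_with_failure(events, window_seconds, fail_after):
    """Process events, simulate a crash after fail_after events, then recover."""
    counter = TumblingWindowCounter(window_seconds)
    for timestamp, key in events[:fail_after]:
        counter.process(timestamp, key)
    snapshot = counter.checkpoint()

    # Simulated failure: the in-memory state is lost, only the snapshot survives.
    del counter

    recovered = TumblingWindowCounter.restore(snapshot)
    # Resume from the saved offset so no event is skipped or counted twice.
    for timestamp, key in events[recovered.offset:]:
        recovered.process(timestamp, key)
    return recovered.result()
```

Calling `run_with_failure(events, 10, 3)` on a list of `(timestamp, key)` tuples yields the same per-window counts as an uninterrupted run over the same list, which is the guarantee checkpointing is meant to provide. Real frameworks persist such snapshots to durable storage and coordinate them across many workers; this sketch keeps everything in one process to show only the idea.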
