How it works...

With this recipe, we introduced Spark Streaming using a technique many overlook: building a streaming application on Spark's QueueInputDStream class. QueueInputDStream is useful not only for understanding Spark Streaming, but also for debugging during the development cycle. In the first steps, we set up a few data structures to generate pseudorandom clickstream event data for stream processing at a later stage.

Note that in step 12 we create a StreamingContext instead of a SparkContext. The StreamingContext is the entry point for Spark Streaming applications. Next, we create a queue and a queue stream to receive the streaming data. Now steps ...
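The setup described above can be sketched roughly as follows. This is a minimal illustration rather than the recipe's actual code: the `ClickEvent` case class, its fields, the page list, the batch size, and the one-second batch interval are all assumptions made for the example. It relies on `StreamingContext.queueStream`, which treats each RDD pushed onto a `scala.collection.mutable.Queue` as one batch of the stream.

```scala
import scala.collection.mutable
import scala.util.Random

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object QueueStreamSketch {
  // Hypothetical event shape; the recipe's real fields are not shown in this section.
  case class ClickEvent(userId: Int, page: String)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("QueueStreamSketch")
    // A StreamingContext (rather than a bare SparkContext) drives the micro-batches.
    // The 1-second batch interval is an assumption for this sketch.
    val ssc = new StreamingContext(conf, Seconds(1))

    // Each RDD placed on this queue is consumed as one batch of the stream.
    val rddQueue = new mutable.Queue[RDD[ClickEvent]]()
    val events = ssc.queueStream(rddQueue)

    // Count clicks per page within each batch and print the result.
    events.map(e => (e.page, 1)).reduceByKey(_ + _).print()

    ssc.start()

    // Generate pseudorandom clickstream batches and feed them to the queue.
    val pages = Vector("/home", "/cart", "/checkout") // assumed page names
    for (_ <- 1 to 5) {
      val batch = Seq.fill(100)(
        ClickEvent(Random.nextInt(10), pages(Random.nextInt(pages.size))))
      rddQueue.synchronized {
        rddQueue += ssc.sparkContext.makeRDD(batch)
      }
      Thread.sleep(1000)
    }

    ssc.stop(stopSparkContext = true, stopGracefully = true)
  }
}
```

Because the data comes from an in-memory queue, a sketch like this runs locally with no external source such as Kafka or a socket, which is what makes the approach convenient for debugging.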
