How it works...

In this recipe, we created a simple data generation server to simulate a stream of voting data and then counted the votes. The following figure provides a high-level depiction of this concept:

First, we started the data generation server. Second, we defined a socket data source, which lets us connect to that server. Third, we built a simple Spark expression that groups the incoming records by villain (that is, a bad superhero) and counts all votes received so far. Finally, we configured a trigger interval of 10 seconds for the streaming query, so that every 10 seconds it writes the accumulated results to the console.
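The four steps above can be sketched outside Spark to make the data flow concrete. The following is a minimal plain-Python analogue, not the recipe's Spark code: the function names (`start_vote_server`, `read_lines`, `count_votes`), the sample villain names, and the one-vote-per-line wire format are all assumptions made for illustration. To keep the output deterministic, the sketch emits a snapshot after a fixed number of votes (`batch_size`) rather than on a 10-second timer as the recipe does.

```python
import socket
import threading
from collections import Counter


def start_vote_server(votes, host="127.0.0.1"):
    """Step 1: a toy data generation server (hypothetical helper).

    Sends each vote as one newline-terminated line to the first client
    that connects, then closes the connection.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))  # port 0 asks the OS for any free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _ = server.accept()
        with conn:
            for vote in votes:
                conn.sendall((vote + "\n").encode("utf-8"))
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return port, thread


def read_lines(host, port):
    """Step 2: a socket source that yields one text line per vote."""
    with socket.create_connection((host, port)) as conn:
        with conn.makefile("r", encoding="utf-8") as stream:
            for line in stream:
                yield line.strip()


def count_votes(lines, batch_size=3):
    """Steps 3 and 4: group by villain, keep running totals, and emit a
    snapshot of all accumulated counts after every `batch_size` votes.

    The count-based batch stands in for the recipe's 10-second trigger.
    """
    totals = Counter()
    seen = 0
    for villain in lines:
        if not villain:
            continue  # skip blank lines
        totals[villain] += 1
        seen += 1
        if seen % batch_size == 0:
            yield dict(totals)
    if seen % batch_size != 0:
        yield dict(totals)  # flush the final partial batch


def run_demo(votes, batch_size=3):
    """Wire the pieces together and return every emitted snapshot."""
    port, thread = start_vote_server(votes)
    snapshots = list(count_votes(read_lines("127.0.0.1", port), batch_size))
    thread.join(timeout=5)
    return snapshots
```

Calling `run_demo(["Joker", "Bane", "Joker", "Penguin", "Joker"])` returns one snapshot per batch, and each snapshot holds the totals accumulated since the start rather than only the latest batch. That mirrors the way the recipe's query dumps the accumulated results at every trigger.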

There ...
