Twitter trending topics using Spark Streaming

In the previous recipe, we took a look at Spark's SQL integration. In this recipe, we are going to explore another powerful module called Spark Streaming. As the name suggests, Spark Streaming can listen to a stream of events and process the data as it arrives.

Getting ready

To perform this recipe, you should have Hadoop and Spark installed. You also need to install Scala; I am using Scala 2.11.0. You should also have a Twitter account, along with the API keys and access tokens needed to read tweets.

How to do it...

Spark Streaming supports input from various sources, such as Flume, HDFS, Kafka, and Twitter. In this recipe, we are going to use Spark Streaming's Twitter source, where we will be listening to streaming tweets ...
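The recipe's own code is not shown in this excerpt, so the following is only a minimal sketch of how a Twitter source is commonly wired into Spark Streaming to count trending hashtags. It assumes the `spark-streaming-twitter` module (bundled with Spark 1.x, later maintained in Apache Bahir) and twitter4j are on the classpath. The object name `TrendingTopics`, the helper `hashtags`, the environment variable names, and the 10-second batch and 60-second window sizes are all hypothetical choices for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object TrendingTopics {
  // Hypothetical helper: keep whitespace-separated tokens that look like hashtags.
  def hashtags(text: String): Seq[String] =
    text.split("\\s+").filter(t => t.startsWith("#") && t.length > 1).toSeq

  def main(args: Array[String]): Unit = {
    // twitter4j reads OAuth credentials from these system properties.
    // The environment variable names are assumptions, not part of the recipe.
    System.setProperty("twitter4j.oauth.consumerKey", sys.env("TWITTER_CONSUMER_KEY"))
    System.setProperty("twitter4j.oauth.consumerSecret", sys.env("TWITTER_CONSUMER_SECRET"))
    System.setProperty("twitter4j.oauth.accessToken", sys.env("TWITTER_ACCESS_TOKEN"))
    System.setProperty("twitter4j.oauth.accessTokenSecret", sys.env("TWITTER_ACCESS_TOKEN_SECRET"))

    // local[2]: one thread for the receiver, at least one for processing.
    val conf = new SparkConf().setAppName("TrendingTopics").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batches

    // Passing None makes twitter4j build its authorization from the properties above.
    val tweets = TwitterUtils.createStream(ssc, None)

    // Count each hashtag over a sliding 60-second window, recomputed every 10 seconds.
    val counts = tweets
      .flatMap(status => hashtags(status.getText))
      .map(tag => (tag, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(60), Seconds(10))

    // Print the ten most frequent hashtags of each window on the driver.
    counts.foreachRDD { rdd =>
      val top = rdd.sortBy(_._2, ascending = false).take(10)
      println("Top hashtags in the last 60 seconds:")
      top.foreach { case (tag, n) => println(s"$tag ($n)") }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keeping the hashtag extraction in a small pure function makes it possible to check that part without a live Twitter connection; the streaming part needs valid credentials to produce any output.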
