Kafka Twitter producer application

We are now ready to develop our Python-based Kafka producer application, which will capture tweets about airlines in real time and publish them to the Apache Kafka twitter topic that we created previously. We will use the following two Python libraries to develop our Kafka producer:

  • tweepy: This library allows us to access the Twitter API programmatically from Python, using the consumer API keys and access tokens that we generated earlier.
  • pykafka: This library allows us to instantiate a Python-based Apache Kafka client through which we can communicate and transact with our single-node Kafka cluster.
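
To make the pykafka side of this concrete, here is a minimal sketch of publishing messages to the twitter topic. It is not the kafka_twitter_producer.py file itself. The broker address 127.0.0.1:9092, the helper names make_producer and publish_tweets, and the choice to serialize each tweet as JSON are all assumptions made for illustration. The tweepy streaming side is left out because its streaming classes differ between tweepy versions.

```python
import json


def make_producer(hosts="127.0.0.1:9092", topic_name="twitter"):
    """Connect to Kafka and return a synchronous producer for one topic.

    The default broker address is an assumption for a local single-node
    cluster; the topic name follows the text above.
    """
    # Imported here so publish_tweets can be exercised without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts=hosts)
    topic = client.topics[topic_name]
    # A sync producer blocks until each message is acknowledged by the broker.
    return topic.get_sync_producer()


def publish_tweets(tweets, producer):
    """Publish each tweet (a dict) as UTF-8 encoded JSON; return the count sent."""
    count = 0
    for tweet in tweets:
        # pykafka producers expect the message payload as bytes.
        producer.produce(json.dumps(tweet).encode("utf-8"))
        count += 1
    return count
```

With a broker running, calling publish_tweets(some_tweets, make_producer()) would send each tweet as one Kafka message. In a real application the tweets would come from a tweepy stream rather than a list.
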
The following Python code file, called kafka_twitter_producer.py ...
