Learning Hadoop 2 by Garry Turkington, Gabriele Modena

Getting started

We will use the stream.py script's options to extract JSON data and to retrieve a specific number of tweets. We can do this with a command such as the following:

$ python stream.py -j -n 10000 > tweets.json

The tweets.json file will contain one tweet per line, each represented as a JSON string.
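Because each line is a separate JSON document, the file can be processed one line at a time rather than parsed as a whole. The following is a minimal sketch of reading such a file in Python; the helper names parse_tweets and load_tweets are hypothetical and are not part of stream.py.

```python
import json


def parse_tweets(lines):
    """Yield one decoded object per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def load_tweets(path="tweets.json"):
    """Lazily read tweets from a file, so a large capture need not fit in memory."""
    with open(path, encoding="utf-8") as handle:
        yield from parse_tweets(handle)
```

For example, sum(1 for _ in load_tweets()) counts the tweets in tweets.json without loading them all at once.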

Remember that the Twitter API credentials need to be made available as environment variables or hardcoded in the script itself.
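As an illustration of the environment-variable approach, the sketch below collects the credentials and fails early if any are missing. The variable names are assumptions made for this example; check stream.py for the names it actually reads.

```python
import os

# Hypothetical variable names; use whatever names stream.py actually expects.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
)


def read_credentials(environ=None):
    """Return the credentials as a dict, or raise if any variable is unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Reading credentials from the environment keeps secrets out of the script, which matters if the script is shared or kept under version control.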