Chapter 13. Working with MapReduce

MongoDB is a document-oriented database that companies such as Forbes, Bit.ly, Foursquare, and Craigslist use to handle large amounts of data. In Chapter 12, Data Processing and Aggregation with MongoDB, we learned how to perform basic operations and aggregations with MongoDB. In this chapter, we will use Jupyter and PyMongo to learn how MongoDB implements the MapReduce programming model.

In this chapter, we will cover the following topics:

  • An overview of MapReduce
  • Programming model
  • Using MapReduce with MongoDB
  • Filtering the input collection
  • Grouping and aggregation
  • The most common words in Tweets
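
As a preview of the programming model named in the topics above, here is a minimal pure-Python sketch of the map, group, and reduce steps applied to a word count. It does not use MongoDB or PyMongo. The function names (map_phase, shuffle, reduce_phase) and the sample tweets are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(documents):
    """Emit a (word, 1) pair for every word in every document."""
    for text in documents:
        for word in text.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as a MapReduce framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical sample data standing in for a collection of tweets.
tweets = [
    "mongodb stores documents",
    "mapreduce with mongodb",
    "documents in mongodb",
]

counts = reduce_phase(shuffle(map_phase(tweets)))

# Sort by descending count, breaking ties alphabetically.
top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:2]
print(top)
```

The map step only emits key-value pairs and the reduce step only sees the values for one key, so the two steps can be distributed independently. That separation is the core idea of the model.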

Tip

The following link lists production deployments of MongoDB:

https://www.mongodb.com/industries ...

Get Practical Data Analysis - Second Edition now with the O’Reilly learning platform.