Mahout in Action by Ellen Friedman, Ted Dunning, Robin Anil, Sean Owen

Chapter 6. Distributing recommendation computations

This chapter covers

  • Analyzing a massive data set from Wikipedia
  • Producing recommendations with Hadoop and distributed algorithms
  • Pseudo-distributing existing nondistributed recommenders

This book has looked at increasingly large data sets: from tens of preferences, to 100,000, to 10 million, and then 17 million. But even 17 million is only medium-sized in the world of recommenders. This chapter ups the ante again by tackling a larger data set of 130 million preferences in the form of article-to-article links from Wikipedia’s massive corpus.[1] In this data set, the articles are both the users and the items, which also demonstrates how recommenders can be usefully applied, with Mahout, to less ...
