Hadoop in Action

By Chuck Lam. Published by Manning Publications.

Chapter 10. Programming with Pig

This chapter covers

  • Installing Pig and using the Grunt shell
  • Understanding the Pig Latin language
  • Extending the Pig Latin language with user-defined functions
  • Computing similar documents efficiently, using a simple Pig Latin script

One frequent complaint about MapReduce is that it’s difficult to program. When you first think through a data processing task, you may think about it in terms of data flow operations, such as loops and filters. However, as you implement the program in MapReduce, you’ll have to think at the level of mapper and reducer functions and job chaining. Certain functions that are treated as first-class operations in higher-level languages become nontrivial to implement in MapReduce, as ...
