Mastering Hadoop by Sandeep Karanth

Compiling Pig scripts

The Pig architecture is layered so that execution engines can be plugged in, and Hadoop's MapReduce is one such engine. Compiling and executing a Pig script involves three main phases: preparing the logical plan, transforming it into a physical plan, and finally compiling the physical plan into a MapReduce plan that can be executed in the appropriate execution environment.

The logical plan

The Pig statements are first parsed for syntax errors. Validation of the input files and input data structures happens during parsing. When a schema is present, type checking is also done during this phase. A logical plan is then prepared: a DAG with operators as nodes and data flow as edges. The logical plan ...
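To make the idea of a plan as a DAG concrete, here is a minimal Python sketch of operators as nodes and data flow as edges. It is a toy model, not Pig's internal representation: the class `LogicalPlan`, its methods, the aliases, and the file name `logs.txt` are all hypothetical names chosen for illustration.

```python
from collections import deque


class LogicalPlan:
    """Toy DAG: nodes are operators keyed by alias, edges follow data flow.

    Hypothetical model for illustration only; it does not mirror Pig's classes.
    """

    def __init__(self):
        self.operators = {}  # alias -> textual description of the operator
        self.edges = {}      # alias -> list of downstream aliases

    def add_operator(self, alias, description, inputs=()):
        # An operator may only read from aliases that already exist.
        # This check also keeps the graph acyclic by construction.
        for parent in inputs:
            if parent not in self.operators:
                raise ValueError(f"undefined alias: {parent}")
        self.operators[alias] = description
        self.edges[alias] = []
        for parent in inputs:
            self.edges[parent].append(alias)

    def topological_order(self):
        # Kahn's algorithm: emit an operator once all of its inputs are emitted.
        indegree = {alias: 0 for alias in self.operators}
        for children in self.edges.values():
            for child in children:
                indegree[child] += 1
        ready = deque(a for a, d in indegree.items() if d == 0)
        order = []
        while ready:
            alias = ready.popleft()
            order.append(alias)
            for child in self.edges[alias]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        return order


def build_example_plan():
    # Hypothetical four-statement script: load, filter, group, aggregate.
    plan = LogicalPlan()
    plan.add_operator("logs", "LOAD 'logs.txt'")
    plan.add_operator("errors", "FILTER logs BY level == 'ERROR'", inputs=["logs"])
    plan.add_operator("by_host", "GROUP errors BY host", inputs=["errors"])
    plan.add_operator("counts", "FOREACH by_host GENERATE group, COUNT(errors)",
                      inputs=["by_host"])
    return plan


if __name__ == "__main__":
    plan = build_example_plan()
    for alias in plan.topological_order():
        print(alias, "<-", plan.operators[alias])
```

Walking the graph in topological order visits each operator only after the operators that feed it, which is the property a later compilation phase would rely on when translating the plan step by step.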