Practical Machine Learning by Sunila Gollapudi

Hadoop 2.x

Before Hadoop 2.x, distributions focused on addressing the limitations of Hadoop 1.x without deviating from its core architecture. Hadoop 2.x changed the underlying architectural assumptions and turned out to be a real breakthrough, most importantly through the introduction of YARN. YARN is a new framework for managing a Hadoop cluster, and it adds the ability to handle real-time processing needs in addition to batch processing. Some of the important issues that Hadoop 2.x addressed are as follows:

  • Single NameNode issues
  • Dramatic increase in the number of nodes in the cluster
  • Extension of the number of tasks that can be successfully handled with Hadoop
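
To make the idea of YARN as a cluster-management layer more concrete, here is a minimal toy sketch in Python. It is not the Hadoop or YARN API: the class names ResourceManager and ApplicationMaster are borrowed from YARN terminology, but the methods, node names, and memory sizes are all hypothetical and chosen only for illustration. It assumes a simplified model in which one shared manager hands out memory "containers" and each application asks for its own, so that a batch job and a stream-style job can share the same cluster.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for a cluster-wide resource manager (not the real YARN API)."""
    free_mb: dict  # hypothetical node name -> free memory in MB
    allocations: list = field(default_factory=list)

    def allocate(self, app_name, mb):
        """Grant a container on the first node with enough free memory, else None."""
        for node, free in self.free_mb.items():
            if free >= mb:
                self.free_mb[node] = free - mb
                container = (app_name, node, mb)
                self.allocations.append(container)
                return container
        return None


@dataclass
class ApplicationMaster:
    """Toy per-application master that requests containers for its own tasks."""
    name: str
    task_mb: int

    def run(self, rm, num_tasks):
        """Request one container per task and return those that were granted."""
        granted = [rm.allocate(self.name, self.task_mb) for _ in range(num_tasks)]
        return [c for c in granted if c is not None]


# Two different kinds of workload share one cluster (all values are made up).
rm = ResourceManager(free_mb={"node1": 4096, "node2": 4096})
batch = ApplicationMaster(name="batch-mapreduce", task_mb=2048)
stream = ApplicationMaster(name="stream-processing", task_mb=1024)

batch_containers = batch.run(rm, num_tasks=3)
stream_containers = stream.run(rm, num_tasks=2)
print(batch_containers)
print(stream_containers)
```

The point of the sketch is the separation of roles: resource allocation lives in one place, while each application decides what to run in the containers it is granted. The real scheduler is considerably more involved.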

The following figure depicts the difference between the Hadoop 1.x and 2.x architectures ...
