Hadoop on Cloud

Hadoop is a distributed system capable of running across thousands of nodes, and clusters of that size are already in production. In this book, we developed solutions on a single-node cluster. Such a setup is good for learning but not sufficient for a production environment. Setting up even a modest three- or five-node Hadoop cluster at home is often impractical because of the hardware cost. In a company, securing the budget for a five-node cluster means going through an approval process and then ordering the hardware, both of which can take considerable time.

Hadoop on Cloud offers a good alternative to having a multinode Hadoop setup in your own data center ...
