Chapter 9. Running Hadoop in the cloud

This chapter covers

  • Setting up a compute cloud with Amazon Web Services (AWS)
  • Running Hadoop in the AWS cloud
  • Transferring data into and out of an AWS Hadoop cloud

Depending on your data processing needs, your Hadoop workload can vary widely over time. You may have a few large data processing jobs that occasionally take advantage of hundreds of nodes, but those same nodes will sit idle the rest of the time. You may be new to Hadoop and want to get familiar with it first before investing in a dedicated cluster. You may own a startup that needs to conserve cash and wants to avoid the capital expense of a Hadoop cluster. In these and other situations, it makes more sense to rent a cluster of machines rather ...

Get Hadoop in Action now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.