Accessing the DynamoDB data using AWS EMR

AWS Elastic MapReduce (EMR) is a hosted Hadoop service from Amazon. Hadoop has become one of the most important ETL/analytics tools, so it is worth knowing how to access DynamoDB data from EMR and use it for analytics. In this recipe, we are going to see how to access DynamoDB data from EMR for analytics and querying.

Getting ready

To get started, you need a DynamoDB table that already has data in it. You also need a key pair, whose private key is used to connect to the EMR cluster using PuTTY or ssh on a UNIX system. If you haven't created one, read the documentation at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMR_SetUp_KeyPair.html ...
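As a rough illustration of the ssh connection mentioned above, the following Python sketch assembles the command line you would run from a UNIX shell. The helper name `build_emr_ssh_command`, the key file path, and the host name are hypothetical placeholders, not values from this recipe. The only assumption about EMR itself is that the master node accepts logins as the `hadoop` user, which is the usual default.

```python
import shlex


def build_emr_ssh_command(key_path, master_dns, user="hadoop"):
    """Return an ssh command line for logging in to an EMR master node.

    key_path   -- path to the private key (.pem) of your key pair (placeholder)
    master_dns -- public DNS name of the cluster's master node (placeholder)
    user       -- login user; 'hadoop' is assumed to be the EMR default
    """
    args = ["ssh", "-i", key_path, "{}@{}".format(user, master_dns)]
    # Quote each argument so paths with spaces survive a copy and paste.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Hypothetical values; substitute your own key file and master node DNS.
    print(build_emr_ssh_command(
        "/home/me/mykeypair.pem",
        "ec2-203-0-113-10.compute-1.amazonaws.com",
    ))
```

On Windows, PuTTY plays the same role, but it expects the private key in its own .ppk format rather than .pem.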
