Appendix B. Hadoop built-in ingress and egress tools

In this appendix, we’ll look at the built-in mechanisms for reading data from and writing data to HDFS, including the NameNode’s embedded HTTP server and Hoop, a REST-based HDFS proxy. This will help you understand which tools Hadoop provides out of the box. Chapter 2 covers higher-level techniques and approaches for data ingress and egress.

B.1. Command line

It’s easy to copy files to and from HDFS using the command-line interface (CLI): the put and get options perform these tasks. The put option is more useful than the copyFromLocal option because it accepts multiple source files and can also read from standard input. For example, to read from standard input and write to a file in HDFS, you’d ...
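
As a rough illustration of the put and get options described above, the following Python sketch builds the corresponding `hadoop fs` command lines and runs them only when a `hadoop` executable is found on the PATH. The helper names (`put_args`, `get_args`, `run`) and all file paths are hypothetical and chosen for illustration; the one convention assumed from the Hadoop FileSystem shell is that a source of `-` tells put to read from standard input.

```python
import shutil
import subprocess


def put_args(sources, dest):
    """Build argv for `hadoop fs -put`; a source of "-" means standard input."""
    return ["hadoop", "fs", "-put", *sources, dest]


def get_args(src, local_dest):
    """Build argv for `hadoop fs -get`, copying from HDFS to the local filesystem."""
    return ["hadoop", "fs", "-get", src, local_dest]


def run(args, stdin_bytes=None):
    """Run the command only if the hadoop CLI is installed; return True if it ran."""
    if shutil.which(args[0]) is None:
        print("skipped (hadoop not on PATH):", " ".join(args))
        return False
    subprocess.run(args, input=stdin_bytes, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical paths, used only for illustration.
    # Several local files copied into one HDFS directory in a single call.
    run(put_args(["a.txt", "b.txt"], "/tmp/ingest/"))
    # Bytes piped through standard input and written to a single HDFS file.
    run(put_args(["-"], "/tmp/ingest/stdin.txt"), stdin_bytes=b"hello from stdin\n")
    # An HDFS file copied back to the local filesystem.
    run(get_args("/tmp/ingest/a.txt", "a-copy.txt"))
```

Keeping the argument construction separate from execution makes it possible to inspect the exact command line before anything touches the cluster.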
