10. Data Protection, File Formats and Accessing HDFS

This chapter covers the following:

Safeguarding HDFS data using trash and HDFS snapshots

Ensuring data integrity with file system checks (fsck command)

File-based formats supported by Hadoop

Choosing the optimal file format

The Hadoop small files problem and merging files

Using Hadoop archives ...
