
3.1. Working with files in HDFS

HDFS is a filesystem designed for large-scale distributed data processing under frameworks such as MapReduce. You can store a big data set of (say) 100 TB as a single file in HDFS, something that would overwhelm most other filesystems. We discussed in
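As a sketch of the file operations this section introduces, the basic command-line workflow against HDFS might look like the following, assuming a configured Hadoop client and a running cluster (the paths and file names here are illustrative, not from the book):

```shell
# Create a directory in HDFS (path is illustrative)
hadoop fs -mkdir /user/chuck/input

# Copy a local file into HDFS; HDFS splits large files
# into blocks and distributes them across datanodes
hadoop fs -put example.txt /user/chuck/input/

# List the directory and print the file's contents back
hadoop fs -ls /user/chuck/input
hadoop fs -cat /user/chuck/input/example.txt

# Copy the file back out of HDFS, then delete the HDFS copy
hadoop fs -get /user/chuck/input/example.txt ./example-copy.txt
hadoop fs -rm /user/chuck/input/example.txt
```

These commands run on the client machine; the namenode resolves each HDFS path, and the data itself moves directly between the client and the datanodes.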

From Hadoop in Action
