
3.1. Working with files in HDFS

HDFS is a filesystem designed for large-scale distributed data processing under frameworks such as MapReduce. You can store a big data set of (say) 100 TB as a single file in HDFS, something that would overwhelm most other filesystems.
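To make this concrete, here is a minimal sketch (not from the book) that uses Hadoop's Java FileSystem API to copy a local file into HDFS and then list the destination directory; the file paths and class name are illustrative assumptions.

    // Minimal sketch of basic HDFS file operations via the Java API.
    // The local and HDFS paths below are hypothetical examples.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsPutExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();    // picks up core-site.xml, hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);        // connect to the configured filesystem

            Path local = new Path("/tmp/dataset.txt");          // hypothetical local file
            Path remote = new Path("/user/chuck/dataset.txt");  // hypothetical HDFS destination

            fs.copyFromLocalFile(local, remote);         // upload the local file into HDFS

            // List the destination directory to confirm the upload.
            for (FileStatus status : fs.listStatus(remote.getParent())) {
                System.out.println(status.getPath() + "\t" + status.getLen() + " bytes");
            }
        }
    }

The same upload can be done from the shell with hadoop fs -put /tmp/dataset.txt /user/chuck/dataset.txt, which is often more convenient for one-off transfers.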

