CHAPTER 12

HCatalog and Hadoop in the Enterprise

Previous chapters of this book explored various features of Hadoop. We started with the batch-oriented MapReduce programming model and discussed how this model can support data warehousing with Hive and data pipeline development with Pig.

One of the major obstacles to adopting Hadoop in the enterprise is that Hadoop implementations still require a low-level understanding of the system, including working with files in the distributed file system. Users of databases and ETL systems are accustomed to working with abstractions such as databases and tables, and Hive supports this abstraction. Hive is not ...

Get Pro Apache Hadoop, Second Edition now with the O’Reilly learning platform.