Reusing types smartly

Hadoop problems are often caused by some form of memory mismanagement; affected nodes don't fail suddenly but slow down gradually, much as they do when I/O devices go bad. Hadoop offers many options for controlling memory allocation and usage at several levels of granularity, but it does not validate these options against one another or against the node's physical memory. As a result, the combined heap size of all the daemons running on a machine can exceed the amount of physical memory available.
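As an illustration, the per-task memory settings below are accepted independently of one another and of the node's physical RAM. This is only a minimal sketch, assuming the YARN-era (MRv2) property names; the specific values and slot count are made up for the example:

import org.apache.hadoop.conf.Configuration;

public class MemorySettingsSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Container sizes requested for each map and reduce task (MB).
        conf.setInt("mapreduce.map.memory.mb", 2048);
        conf.setInt("mapreduce.reduce.memory.mb", 4096);

        // JVM heap limits for the task JVMs; these must fit inside the containers above.
        conf.set("mapreduce.map.java.opts", "-Xmx1638m");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");

        // Hadoop does not cross-check these values: with, say, 16 concurrent map tasks
        // on an 8 GB node, the worst-case footprint (16 * 2048 MB = 32 GB) far exceeds
        // physical memory, and nothing stops the job from being submitted anyway.
        int mapSlots = 16;
        int worstCaseMb = mapSlots * conf.getInt("mapreduce.map.memory.mb", 1024);
        System.out.println("Worst-case map memory footprint: " + worstCaseMb + " MB");
    }
}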

Each Java process also has its own configured maximum heap size. Depending on whether the JVM heap, an OS limit, or physical memory is exhausted first, the result is an out-of-memory error, a JVM abort, or severe swapping, respectively.
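Which of these failure modes you hit depends on which limit gives out first, so it pays to leave headroom between the JVM's -Xmx and whatever the container or machine can actually supply. The following is an illustrative sanity check only; the 80% rule of thumb and the property name it reads are assumptions, not Hadoop defaults:

import org.apache.hadoop.conf.Configuration;

public class HeapSanityCheck {
    /**
     * Warn if the JVM heap ceiling (-Xmx) is not comfortably below the
     * memory granted to the task, leaving headroom for native allocations,
     * thread stacks, and the code cache.
     */
    public static void warnIfHeapTooLarge(Configuration conf) {
        long containerMb = conf.getLong("mapreduce.map.memory.mb", 1024);
        long heapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        // Illustrative rule of thumb: keep the heap at or below ~80% of the container.
        if (heapMb > containerMb * 0.8) {
            System.err.println("Heap ceiling of " + heapMb + " MB leaves little headroom in a "
                    + containerMb + " MB container; expect OOM errors or swapping under load.");
        }
    }
}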

You should pay attention to memory management. All unnecessarily allocated memory resources ...
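Avoiding unnecessary allocation is where the object-reuse pattern behind this section's title comes in: rather than creating fresh key/value objects for every input record, a mapper can reuse a single pair of Writable instances so the garbage collector has far less short-lived work to do. A minimal word-count-style sketch, assuming a simple whitespace tokenizer (the class and field names are illustrative):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ReusingTypesMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Allocated once per task instead of once per record.
    private final Text word = new Text();
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);            // reuse the same Text instance
                context.write(word, ONE);   // the framework serializes the pair immediately
            }
        }
    }
}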
