Troubleshooting

Collecting counters, logs, and other monitoring data is useful, but when an application misbehaves it can be hard to know where to find the information you need. In this section, we will look at how Hadoop stores logs and system information. We can distinguish three types of logs, as follows:

  • YARN applications, including MapReduce jobs
  • Daemon logs (NameNode and ResourceManager)
  • Services that log non-distributed workloads, for example, HiveServer2 logging to /var/log
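For the first category, the aggregated logs of a finished YARN application can usually be retrieved with the `yarn logs -applicationId` command. The following is a minimal sketch of wrapping that command in Python; the helper names `build_yarn_logs_command` and `fetch_yarn_logs` are hypothetical, and the sketch assumes that the `yarn` client is on the PATH and that log aggregation is enabled on the cluster.

```python
import re
import subprocess

# YARN application IDs look like application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"^application_\d+_\d+$")


def build_yarn_logs_command(app_id):
    """Return the argument list for `yarn logs -applicationId <app_id>`.

    Hypothetical helper: it only validates the ID shape and builds the
    command; it does not contact the cluster.
    """
    if not APP_ID_PATTERN.match(app_id):
        raise ValueError("not a YARN application ID: {!r}".format(app_id))
    return ["yarn", "logs", "-applicationId", app_id]


def fetch_yarn_logs(app_id):
    """Run the command and return the aggregated logs as text.

    Assumes the `yarn` client is installed and log aggregation is enabled;
    otherwise the command fails and CalledProcessError is raised.
    """
    result = subprocess.run(
        build_yarn_logs_command(app_id),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

Keeping the command construction separate from its execution makes it possible to check the ID before anything is sent to the cluster. Daemon and service logs, by contrast, are ordinary files on the hosts that run them and can be read with standard tools.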

In addition to these log types, Hadoop exposes a number of metrics at the filesystem level (storage availability, replication factor, and number of blocks) and at the system level. As mentioned, both Apache Ambari and Cloudera Manager, ...
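As an illustration of the filesystem-level metrics, Hadoop daemons publish their metrics as JSON through a `/jmx` endpoint on their web interface. The sketch below reads a few values from the NameNode's `FSNamesystem` bean. The helper names are hypothetical, the port `50070` is the Hadoop 2 default for the NameNode web interface and may differ on your cluster, and the three fields were chosen only as examples of storage and block metrics.

```python
import json
from urllib.request import urlopen

# JMX bean that carries the NameNode's filesystem-level metrics.
FS_BEAN = "Hadoop:service=NameNode,name=FSNamesystem"

# Example fields only; the bean exposes many more.
FIELDS = ("CapacityRemaining", "BlocksTotal", "UnderReplicatedBlocks")


def jmx_url(host, port=50070):
    """Build the /jmx query URL for the FSNamesystem bean.

    Assumption: 50070 is the Hadoop 2 default NameNode HTTP port.
    """
    return "http://{}:{}/jmx?qry={}".format(host, port, FS_BEAN)


def extract_fs_metrics(payload):
    """Pick the example fields out of a decoded /jmx JSON response.

    Returns an empty dict if the FSNamesystem bean is not present.
    """
    for bean in payload.get("beans", []):
        if bean.get("name") == FS_BEAN:
            return {field: bean.get(field) for field in FIELDS}
    return {}


def fetch_fs_metrics(host, port=50070):
    """Query a live NameNode; requires network access to its web UI."""
    with urlopen(jmx_url(host, port), timeout=10) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return extract_fs_metrics(payload)
```

Parsing is kept apart from the HTTP call so that the extraction logic can be exercised against a saved response without a running cluster.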
