Toolkits for verifying health (individual diagnostics)
To determine the health of a cluster, you need the current state of every component it is built from. In most installations, these components are:
Compute nodes
Ethernet network
InfiniBand network
Storage
In this chapter, we describe the IBM Cluster Health Check (CHC) toolkit, which is used to perform checks on these components. Based on the results of these checks, we can state whether the cluster is healthy.
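The idea of combining per-component results into a single verdict can be sketched in a few lines of Python. This is only an illustration and not part of the CHC toolkit: the `CheckResult` type, the `summarize_health` function, and the component names are hypothetical, and the sketch assumes that a cluster counts as healthy only when every component has been checked and no check has failed.

```python
from dataclasses import dataclass

# Hypothetical component names, taken from the list above.
REQUIRED_COMPONENTS = {
    "compute nodes",
    "ethernet network",
    "infiniband network",
    "storage",
}


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical type, not a CHC structure)."""
    component: str
    passed: bool
    detail: str = ""


def summarize_health(results):
    """Return (healthy, missing_components, failed_checks).

    Assumption: the cluster is healthy only if every required component
    has at least one result and none of the results failed.
    """
    seen = {r.component for r in results}
    missing = sorted(REQUIRED_COMPONENTS - seen)
    failed = [r for r in results if not r.passed]
    healthy = not missing and not failed
    return healthy, missing, failed


if __name__ == "__main__":
    results = [
        CheckResult("compute nodes", True),
        CheckResult("ethernet network", True),
        CheckResult("infiniband network", False, "example failure detail"),
    ]
    healthy, missing, failed = summarize_health(results)
    print("healthy:", healthy)
    print("not checked:", missing)
    print("failed:", [r.component for r in failed])
```

Treating an unchecked component as unhealthy is a deliberate choice in this sketch: a verdict based on partial data would otherwise look the same as a clean result.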
This chapter provides information about the following topics:
