Cluster Failures

This section describes how the Sun Cluster 3.0 architecture handles the complex system failures presented in “Failures in Complex Systems” and “Failures in Clustered Systems”.

Failure Detection

No single component within the product is responsible for detecting and recovering from failures. Instead, components such as the public network infrastructure and the applications rely on their own fault probes to determine the condition of their particular service. Sun Cluster 3.0 implements a system of both local and remote fault probes, so you can distinguish connectivity problems from data service problems. The volume management products are responsible for detecting and recovering from failures in the disk subsystem. “Recoverable Failures ...
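The way paired local and remote probes separate connectivity problems from data service problems can be illustrated with a small sketch. The following Python is not Sun Cluster code and does not use any Sun Cluster API; the names Diagnosis and diagnose, and the decision rule itself, are assumptions made for illustration. It assumes that a local probe runs on the node hosting the service and that a remote probe reaches the same service from another node over the public network.

```python
from enum import Enum


class Diagnosis(Enum):
    """Hypothetical outcomes of combining a local and a remote probe."""
    HEALTHY = "healthy"
    CONNECTIVITY_PROBLEM = "connectivity problem"
    DATA_SERVICE_PROBLEM = "data service problem"


def diagnose(local_probe_ok: bool, remote_probe_ok: bool) -> Diagnosis:
    """Classify a fault from two probe results.

    Assumed rule (illustrative only):
    - both probes succeed: the service is healthy;
    - local succeeds, remote fails: the service answers on its own node,
      so the path to it is suspect;
    - local fails: the service itself is suspect, whatever the remote
      probe reports.
    """
    if local_probe_ok and remote_probe_ok:
        return Diagnosis.HEALTHY
    if local_probe_ok:
        return Diagnosis.CONNECTIVITY_PROBLEM
    return Diagnosis.DATA_SERVICE_PROBLEM


if __name__ == "__main__":
    # Print the classification for every combination of probe results.
    for local_ok in (True, False):
        for remote_ok in (True, False):
            result = diagnose(local_ok, remote_ok)
            print(f"local={local_ok} remote={remote_ok}: {result.value}")
```

A single probe cannot make this distinction: a failed remote probe alone looks the same whether the application has stopped or the network path to it is down. Comparing it with the local result is what narrows the fault to one side or the other.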

