Fault Tolerance and Disaster Recovery

The ability to recover from failures is critical to the proper functioning of any system, including Operations Manager. Fault tolerance and disaster recovery are closely related concepts, but they are fundamentally different.

Fault tolerance is the ability to continue operating even in the event of a failure, so that a failure does not result in loss of service. Fault-tolerance mechanisms, such as clustering or load-balanced components, typically activate within seconds or minutes. They also tend to carry high costs, such as duplicated hardware.

In contrast, disaster recovery is the ability to restore operations after a loss of service. This ...
