Disaster Recovery

Disaster recovery is the art of being able to resume normal systems operations when faced with a disaster scenario. What constitutes a disaster depends on your context. In general, I consider a disaster to be an anomalous event that causes the interruption of normal operations. In a traditional data center, for example, the loss of a hard drive is not a disaster scenario, because it is more or less an expected event. A fire in the data center, on the other hand, is an abnormal event likely to cause an interruption of normal operations.

The total and sudden loss of a complete server, which you might consider a disaster in a physical data center, happens all the time in the cloud, relatively speaking. Although that frequency takes such events out of the realm of disaster recovery proper, you still need solid disaster recovery processes to deal with them. As a result, disaster recovery is not simply a good idea that you can keep putting off in favor of other priorities; it is a requirement.

What makes disaster recovery so problematic in a physical environment is the amount of manual labor required to prepare for and execute a disaster recovery plan. Furthermore, fully testing your processes and procedures is often very difficult. Too many organizations have a disaster recovery plan that has never been tested in an environment that replicates real-world conditions closely enough to give them confidence that the plan will work.

Disaster recovery in the cloud can ...
