CHAPTER 22

Availability

A maxim suggests that you shouldn’t put all your eggs in one basket. Of course, whether this is sound advice depends on the cost of baskets, the value of the eggs, risk probabilities, threat vectors, and failure modes. Availability, performance, reliability, business continuity, and disaster recovery are all critical concerns for business operations and processes in general, and IT in particular. The cloud offers some unique advantages here.

“Availability” generally means the percentage of time that the business is able to offer services. It is usually expressed as the percentage of uptime: the ratio of uptime to total time, where total time is uptime plus downtime. Both scheduled downtime and unforeseen issues reduce it. There are two generic ways to maximize availability: (1) increase uptime, traditionally by using reliable components in a reliable architecture; and (2) reduce downtime, through accelerated processes for detecting, diagnosing, and repairing failures. However, the traditional approach of building on highly reliable components and architecture is giving way to a new strategy that assumes the foundation is inherently unreliable, in the same way that resilient buildings on an unreliable tectonic foundation are now constructed to survive earthquakes.
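The uptime ratio and the two levers described above can be made concrete with a small calculation. The following is a minimal sketch, not a formula taken from the book: the function names `availability` and `yearly_availability`, the constant `HOURS_PER_YEAR`, and the failure counts and repair times are all illustrative assumptions.

```python
# Minimal sketch of availability as uptime / (uptime + downtime).
# All names and figures here are hypothetical, chosen only for illustration.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return the fraction of total time during which the service was up."""
    total_hours = uptime_hours + downtime_hours
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return uptime_hours / total_hours


def yearly_availability(failures_per_year: int, hours_to_repair: float) -> float:
    """Availability over one year, assuming every failure costs the same repair time."""
    downtime_hours = failures_per_year * hours_to_repair
    if downtime_hours > HOURS_PER_YEAR:
        raise ValueError("downtime cannot exceed the length of the year")
    return availability(HOURS_PER_YEAR - downtime_hours, downtime_hours)


# Assumed baseline: 4 failures a year, 6 hours each to detect, diagnose, and repair.
baseline = yearly_availability(failures_per_year=4, hours_to_repair=6.0)

# Lever 1: increase uptime by failing less often (more reliable components).
fewer_failures = yearly_availability(failures_per_year=2, hours_to_repair=6.0)

# Lever 2: reduce downtime by repairing faster (same failure rate).
faster_repair = yearly_availability(failures_per_year=4, hours_to_repair=3.0)

for label, value in [
    ("baseline", baseline),
    ("fewer failures", fewer_failures),
    ("faster repair", faster_repair),
]:
    print(f"{label:>15}: {value:.4%}")
```

Under these assumed numbers, halving the failure count and halving the repair time remove the same 12 hours of downtime, so either lever yields the same availability. Which one is cheaper to pull is a separate question.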

“Performance” is a catchall term for a variety of factors, including response time—how quickly services are performed—and throughput—the volume of transactions. ...

