Coefficient of Variation and Other Statistics

Whenever a mathematical function varies over time, a number of measures, or statistics, can summarize its behavior. They may not tell us everything there is to know about the function, but they can tell us some useful things.

One useful statistic is the mean, or average value. For a discrete series, the mean is the sum of the values divided by the number of terms. For a continuous function, the mean is the integral of the function divided by the length of the interval.
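The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `discrete_mean` and `continuous_mean` are hypothetical, and the continuous case approximates the integral numerically with the midpoint rule rather than computing it exactly.

```python
def discrete_mean(values):
    """Mean of a discrete series: sum of the values divided by the number of terms."""
    return sum(values) / len(values)


def continuous_mean(f, a, b, steps=10_000):
    """Mean of f over [a, b]: integral of f divided by the interval length.

    The integral is approximated with the midpoint rule (an assumption made
    for this sketch; any numerical integration method would do).
    """
    width = (b - a) / steps
    integral = sum(f(a + (i + 0.5) * width) for i in range(steps)) * width
    return integral / (b - a)


# A constant series, and a function that ramps linearly from 0 to 20.
series_mean = discrete_mean([100, 100, 100, 100, 100, 100])
ramp_mean = continuous_mean(lambda t: 2 * t, 0, 10)
```

Here `series_mean` is 100, and `ramp_mean` comes out at approximately 10, the midpoint of the ramp.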

The mean alone, however, doesn’t tell us how variable or flat the function is. For example, the set of values 100, 100, 100, 100, 100, 100 has the same mean as this set: 0, 200, –1000, 1200, 150, 50. Consequently, statisticians use metrics such as variance and standard deviation to describe variability.
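As a quick check of the example above, the following sketch uses Python's standard `statistics` module to compare the two sets. It assumes the population forms of variance and standard deviation (`pvariance`, `pstdev`); the text does not say whether the population or sample form is intended, and the variable names `flat` and `spiky` are made up for illustration.

```python
from statistics import mean, pstdev, pvariance

flat = [100, 100, 100, 100, 100, 100]
spiky = [0, 200, -1000, 1200, 150, 50]

# Both sets share the same mean of 100 ...
flat_mean = mean(flat)
spiky_mean = mean(spiky)

# ... but their spread is very different (population forms assumed here).
flat_variance = pvariance(flat)    # zero: every value equals the mean
spiky_variance = pvariance(spiky)  # large: values swing far from the mean
flat_stdev = pstdev(flat)
spiky_stdev = pstdev(spiky)

print(f"flat:  mean={flat_mean}, variance={flat_variance}, stdev={flat_stdev}")
print(f"spiky: mean={spiky_mean}, variance={spiky_variance}, stdev={spiky_stdev:.1f}")
```

The flat set has zero variance, while the second set has a standard deviation of several hundred around the same mean.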

However, standard deviation presents us with an issue. The series 0, 100, 0, 100, 0, 100 has the same standard deviation and variance as the series 1000, 1100, 1000, 1100, 1000, 1100. This may not seem like a big deal, but when we are considering real-world metrics, such as data center utilization, it represents a huge difference. Suppose that the values in the series represent the equivalent demand for servers over time. In the first case, if we build to peak and deploy 100 servers, we will only achieve 50% utilization (since the mean is 50 and the total number of servers deployed is 100). This 50% utilization means that, on average, for every server we have productively employed, ...
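The comparison of the two demand series can be reproduced with a short sketch. Two assumptions are worth flagging: the coefficient of variation is computed with its standard definition (standard deviation divided by mean), which the section title names but this excerpt does not spell out, and the second series is evaluated under the same build-to-peak policy that the text applies to the first. The helper `summarize` and its field names are hypothetical.

```python
from statistics import mean, pstdev

low_demand = [0, 100, 0, 100, 0, 100]
high_demand = [1000, 1100, 1000, 1100, 1000, 1100]


def summarize(demand):
    """Return basic statistics for a series of server-demand values."""
    mu = mean(demand)
    sigma = pstdev(demand)   # population standard deviation (assumed form)
    peak = max(demand)       # build-to-peak: deploy enough servers for the peak
    return {
        "mean": mu,
        "stdev": sigma,
        "cv": sigma / mu,           # coefficient of variation = stdev / mean
        "utilization": mu / peak,   # average demand over deployed capacity
    }


low = summarize(low_demand)
high = summarize(high_demand)

for name, stats in (("low", low), ("high", high)):
    print(
        f"{name}: mean={stats['mean']}, stdev={stats['stdev']:.1f}, "
        f"cv={stats['cv']:.3f}, utilization={stats['utilization']:.1%}"
    )
```

Both series report a standard deviation of 50, yet the first has a coefficient of variation of 1.0 and 50% utilization, while the second has a coefficient of variation below 0.05 and utilization above 95%. Dividing by the mean is what lets the statistic separate the two cases.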

