Building an SLA
One of the main reasons for web operators to collect end-user data is to build an SLA. Even if you don't have a formal SLA with clients, you should have internal targets for uptime and page latency, because site speed has a direct impact on user experience and on the business.
User-facing SLAs have several components (see Table 11-2). You need to be specific about these so that there's no doubt whether an SLA was violated when someone claims that a problem occurred.
Table 11-2. The elements of a user-facing SLA
SLA component | What it means | How it's expressed | Example |
---|---|---|---|
Task being measured | The thing being tested—the business process or function itself | This is usually expressed as a name or description of the test; avoid using just the URL or page name as it makes the test harder to read. | "Updating a contact record" |
Metric being calculated | The element of latency that's being computed. If you can't control it, it shouldn't be in your SLA. | This is a measurement that is specific and can be reproduced across systems. You should know, for example, that "page load time" means "from the first DNS lookup to the browser's | "Host latency" |
Calculation | The math used to generate the number | Unfortunately, this is usually an average. Don't do this. Averages suck. Insist on a percentile (or at the very least a trimmed mean), and a single bad measurement won't ruin an otherwise good month. | "95th percentile" |
Valid times | The times and days when the metric is valid. If you don't include this, you won't have room ... |
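To see why the Calculation row matters, here is a minimal sketch that compares an average, a trimmed mean, and a 95th percentile over the same measurements. The sample latencies, the one-second target, and the function names are assumptions made up for illustration, and the percentile uses the simple nearest-rank definition; monitoring tools may interpolate differently, so the SLA should state which method applies.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def trimmed_mean(samples, trim_fraction=0.05):
    """Mean after discarding trim_fraction of the samples from each end."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in the range [0, 0.5)")
    ordered = sorted(samples)
    k = int(len(ordered) * trim_fraction)  # samples dropped per end
    return mean(ordered[k:len(ordered) - k])


# Hypothetical month of measurements, in seconds: 99 normal page loads
# and one 60-second outlier (for example, a single timed-out test).
latencies = [0.8] * 99 + [60.0]
TARGET_SECONDS = 1.0  # hypothetical SLA target

results = {
    "average": mean(latencies),
    "trimmed mean (5%)": trimmed_mean(latencies, 0.05),
    "95th percentile": percentile(latencies, 95),
}

for name, value in results.items():
    met = value <= TARGET_SECONDS
    print(f"{name}: {value:.3f} s, target met: {met}")
```

With this data the single outlier pushes the average over the target, while the trimmed mean and the 95th percentile still reflect what almost every visitor experienced.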