Effective Monitoring and Alerting by Slawek Ligus

Chapter 3. Alerting

Some people believe that alerting is an art in which proficiency takes long years of trial and error. Perhaps, but most of us can’t wait that long. I prefer to view alerting as an exact science based on logic and probability. It’s about balancing two conflicting objectives: sensitivity, or when to classify an anomaly as problematic, and specificity, or when it is safe to assume that no problem exists. These objectives pull your alerting configuration in two opposite directions. Figuring out the right strategy is not a trivial task, but its effectiveness can be measured. The right choice depends on organizational priorities, the level of recovery built into the monitored system, and the expected impact when things go awry. At any rate, there is nothing supernatural about the process; getting it right is well within everyone’s reach.
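To make the idea of measurable effectiveness concrete, here is a minimal sketch in Python. It assumes the standard statistical definitions of sensitivity and specificity, which may be stricter than the informal wording above. The `AlertOutcomes` class, its field names, and the counts in the example are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertOutcomes:
    """Evaluation periods classified after the fact (hypothetical structure)."""
    true_positives: int   # alarm fired and a real problem existed
    false_positives: int  # alarm fired but nothing was wrong
    true_negatives: int   # no alarm and nothing was wrong
    false_negatives: int  # no alarm although a real problem existed


def sensitivity(o: AlertOutcomes) -> float:
    """Share of real problems that raised an alarm: TP / (TP + FN)."""
    problems = o.true_positives + o.false_negatives
    return o.true_positives / problems if problems else float("nan")


def specificity(o: AlertOutcomes) -> float:
    """Share of healthy periods that stayed quiet: TN / (TN + FP)."""
    healthy = o.true_negatives + o.false_positives
    return o.true_negatives / healthy if healthy else float("nan")


# Made-up counts for one alarm over 1000 evaluation periods.
outcomes = AlertOutcomes(
    true_positives=18,
    false_positives=30,
    true_negatives=950,
    false_negatives=2,
)
print(f"sensitivity = {sensitivity(outcomes):.2f}")  # 18 / 20  -> 0.90
print(f"specificity = {specificity(outcomes):.2f}")  # 950 / 980 -> 0.97
```

With these counts, loosening the alarm threshold would typically catch the two missed problems at the cost of more false alarms, which is the tension between the two objectives described above.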

The Challenge

In my experience, it’s simply impossible to maintain focused attention on a timeseries in anticipation of a problem. The vast amount of information running through the system generates a great number of timeseries to watch. Hiring people solely for the purpose of watching performance graphs is not very cost-effective, and it wouldn’t be a very rewarding job either. Even if it were, though, I’m still not convinced that a human operator would be better at recognizing alertable patterns than a machine.

The process of alerting is full of unstable variables of a qualitative nature, and it presumes an element of responsibility. ...
