Avoiding alert fatigue

Alerts are key in any monitoring system; however, it is crucial to get the alerts "right". Sending warnings too often causes alert fatigue, and as in the story of "the boy who cried wolf", your administrators will soon start to ignore real problems along with the noise.

To avoid alert fatigue, it is helpful to be able to set thresholds not only on event or metric values, but also on a minimum duration (for instance, "the CPU is over 80% for longer than 10 minutes") or a minimum count (for instance, "the number of 408 HTTP responses in the last 5 minutes is greater than 10"). Only once such a threshold is crossed is human involvement considered necessary.
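
To make the two kinds of threshold concrete, here is a minimal sketch in Python. It is not taken from any particular monitoring tool: the class names DurationThreshold and CountThreshold, the observe method, and the use of plain seconds as timestamps are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Optional


class DurationThreshold:
    """Hypothetical helper: fires only when a value stays above `limit`
    for longer than `min_seconds` without dropping back."""

    def __init__(self, limit: float, min_seconds: float) -> None:
        self.limit = limit
        self.min_seconds = min_seconds
        self._breach_started: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True if an alert should be raised."""
        if value <= self.limit:
            # Back under the limit: forget the earlier breach.
            self._breach_started = None
            return False
        if self._breach_started is None:
            # First sample above the limit: start the clock.
            self._breach_started = timestamp
        return timestamp - self._breach_started > self.min_seconds


class CountThreshold:
    """Hypothetical helper: fires when more than `max_count` events
    fall within the last `window_seconds`."""

    def __init__(self, max_count: int, window_seconds: float) -> None:
        self.max_count = max_count
        self.window_seconds = window_seconds
        self._events: Deque[float] = deque()

    def observe(self, timestamp: float) -> bool:
        """Record one event; return True if an alert should be raised."""
        self._events.append(timestamp)
        # Drop events that have slid out of the window.
        while self._events and self._events[0] <= timestamp - self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_count


# "The CPU is over 80% for longer than 10 minutes"
cpu_alert = DurationThreshold(limit=80.0, min_seconds=10 * 60)

# "The number of 408 HTTP responses in the last 5 minutes is greater than 10"
timeout_alert = CountThreshold(max_count=10, window_seconds=5 * 60)
```

In this sketch a single CPU spike, or a handful of timeouts, never reaches an administrator; only a sustained breach or a burst of events does. A real tool would typically express the same rule declaratively rather than in code.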

In addition, good monitoring tools can aggregate events that spike suddenly, by sending out ...
