8.3. What Are Alerts?

Alerts are typically displayed on the operations bridge for immediate attention, whereas other, less critical events and information are captured for future use. The operations team may be using a specific monitoring or alerting package that you need to understand. There may be specifics of that package that you can include in the application or its configuration to improve monitoring or alerting. The solution may also use multiple monitoring and filtering packages. The operations team will be monitoring everything, including low-level network components such as routers and load balancers, and there can be many different systems covering a wide variety of technologies. Certain information gathered by the monitoring master may need to be passed on to, or obtained by, another monitoring solution.

An alert can be raised verbatim from a single captured event, or it can be based on a series of events or on performance data. CPU usage, for example, is a performance statistic. Based on the technical requirements, it might be necessary to raise an alert when CPU usage runs above 90 percent for an extended period of time, which would indicate that a server is "running hot." Another example would be a number of warning events being escalated to a critical event or alert. For instance, a system may have a process that is passing transactions to another system, and the processing backlog is monitored at regular intervals. As throughput decreases and the backlog ...
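
The two kinds of derived alert described above (a performance statistic held above a threshold, and repeated warnings escalated to a critical event) can be sketched roughly as follows. This is a minimal illustration, not the API of any particular monitoring package. The class names, the five-minute reading of "an extended period," and the rule of three warnings within ten minutes are all assumptions made for the example; only the 90 percent figure comes from the text.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class SustainedThresholdRule:
    """Alert when a metric stays above a threshold for a set duration.

    Hypothetical helper; the default duration is an assumption.
    """
    threshold: float = 90.0          # percent, as in the CPU example
    duration_seconds: float = 300.0  # assumed meaning of "extended period"
    _breach_started: Optional[float] = field(default=None, init=False, repr=False)

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True while the alert condition holds."""
        if value <= self.threshold:
            # The metric dropped back to or below the threshold, so start over.
            self._breach_started = None
            return False
        if self._breach_started is None:
            self._breach_started = timestamp
        return timestamp - self._breach_started >= self.duration_seconds


@dataclass
class WarningEscalationRule:
    """Escalate to CRITICAL once enough warnings arrive within a time window.

    Hypothetical helper; the count and window are assumptions.
    """
    max_warnings: int = 3
    window_seconds: float = 600.0
    _recent: Deque[float] = field(default_factory=deque, init=False, repr=False)

    def observe(self, timestamp: float) -> str:
        """Record one warning event; return the severity to report."""
        self._recent.append(timestamp)
        # Discard warnings that have aged out of the window.
        while self._recent and timestamp - self._recent[0] > self.window_seconds:
            self._recent.popleft()
        if len(self._recent) >= self.max_warnings:
            return "CRITICAL"
        return "WARNING"


if __name__ == "__main__":
    cpu_rule = SustainedThresholdRule()
    # (seconds since start, CPU percent); the dip at t=120 resets the timer.
    samples = [(0, 95.0), (60, 97.0), (120, 85.0), (180, 93.0), (300, 96.0), (480, 99.0)]
    for t, cpu in samples:
        if cpu_rule.observe(t, cpu):
            print(f"t={t}s: ALERT, CPU above {cpu_rule.threshold}% for an extended period")

    escalation = WarningEscalationRule()
    for t in [0, 100, 900, 1000, 1100]:
        print(f"t={t}s: {escalation.observe(t)}")
```

Keeping each rule as a small stateful object means the same check can be fed from whatever source the monitoring solution provides, such as polled performance counters or captured events. In practice the thresholds, durations, and counts would come from the technical requirements and from the capabilities of the package the operations team uses.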
