Chapter 16. Monitoring Fundamentals

You can observe a lot by just watching.

—Yogi Berra

Monitoring is the primary way we gain visibility into the systems we run. It is the process of observing information about the state of things for use in both short-term and long-term decision making. The operational goals of monitoring are to detect the precursors of outages so they can be fixed before they become actual outages, to detect actual outages, and to collect information that aids future decision making. Monitoring is difficult. Organizations often monitor the wrong things and sometimes do not monitor the important things.

The ideal monitoring system makes the operations team omniscient and omnipresent. Considering that having the root password ...
