Summary

In this chapter, we talked about incidents, incident response, and alerting. We focused on figuring out when to alert, whom to alert, and how to alert. We then covered what to do once you have acknowledged an alert, how to communicate during the incident, and how to close it with an all-clear message.

Having a proper incident response process in place will help your team shorten the time it takes to resolve incidents and demonstrate its competence to the rest of the company. Your team will be able to show that it knows how to respond when things go wrong, keep its cool, and bring the system back to a healthy state.

In the next chapter, we will talk about postmortems. Postmortems are the act of documenting and looking ...
