19.2. Handling System Failure

Exception handling is one of the core ways to provide resilience within the application code itself. In Chapter 4 you saw that clustering can provide resilience against processes stopping unexpectedly, and that monitoring can help to identify when services and/or processes stop. This section covers what you can do in the application to provide resilience against some of the typical incidents that were listed in Chapter 11, some of which are listed again here:

  • Database transactions deadlocking and/or timing out

  • Null reference exceptions

  • Unhandled exceptions

  • "Access denied" and "file not found" issues

  • Network and connectivity issues

  • Third-party component issues

Resilience to these types of incidents (or exceptions) is typically achieved by including robust exception-handling routines.
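As an illustration of resilience to one incident in the list (a transaction that deadlocks or times out), the following is a minimal sketch of a retry wrapper. It is not taken from the book: `TransientError`, `run_with_retry`, and `flaky_update` are hypothetical names, and `TransientError` stands in for whatever deadlock or timeout exception a real database driver would raise. Python spells the construct `try`/`except` rather than try...catch.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a deadlock or timeout raised by a database driver."""


def run_with_retry(operation, attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call operation() and retry only when it raises TransientError.

    Any other exception propagates immediately, because retrying is
    unlikely to help with, say, a programming error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # Out of attempts: let the caller decide what to do.
            sleep(delay_seconds * attempt)  # Wait a little longer each time.


# Usage: a simulated update that is chosen as the deadlock victim twice.
calls = {"count": 0}


def flaky_update():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("deadlock victim")
    return "committed"


result = run_with_retry(flaky_update)
```

The wrapper retries only the failure it knows to be transient and gives up after a fixed number of attempts, so a persistent fault still surfaces to the caller instead of looping forever.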

Effective exception handling is more than implementing a try...catch block and raising an event and/or logging an exception when it occurs. That approach should be regarded as the minimum acceptable level of exception handling within an application. Really effective exception handling starts with understanding:

  • What the code is going to do.

  • What it is going to be using and executing.

  • Which exceptions could occur when it executes.

  • And most important, what could be done to handle the exceptions.
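The four questions above can be made concrete with a small sketch. It assumes a hypothetical `load_settings` function that reads a JSON settings file. Reading the file could raise "file not found" or "access denied" errors, and each gets a different response: a missing file is recoverable by falling back to defaults, while a permissions problem is not and is passed on with added context. `DEFAULT_SETTINGS` and `SettingsError` are illustrative names, not part of the original text.

```python
import json
import logging

logger = logging.getLogger("settings")

# Hypothetical defaults used when no settings file has been deployed.
DEFAULT_SETTINGS = {"retries": 3}


class SettingsError(Exception):
    """Hypothetical application-level error for unrecoverable settings problems."""


def load_settings(path):
    """Return the settings stored at path, or defaults if the file is missing."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        # Recoverable: the application can run with default values.
        logger.warning("Settings file %s not found; using defaults", path)
        return dict(DEFAULT_SETTINGS)
    except PermissionError as error:
        # Not recoverable here: log it and raise with context for the caller.
        logger.error("Access denied reading settings file %s", path)
        raise SettingsError(f"Cannot read settings file {path}") from error
```

Exceptions that the function has no sensible response to, such as malformed JSON in this sketch, are deliberately left uncaught so that they are not silently hidden.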

There are five simple rules that I follow when thinking about exception handling:

  • Exception handling should never be an afterthought. Retro-fitting exception handling to an application is ...
