19.1. Human Error and User Behavior

As you already know, a solution can have a variety of users, including both business users and technical users. In this section we're going to look at two scenarios where the solution can provide both documented and programmatic resiliency to human error and user behavior. The journey starts with one of the most common causes of system failure due to human error: configuration changes. The second scenario involves supporting and controlling user behavior.

19.1.1. Managing Configuration Changes

An application can't be totally resilient to every possibility. For instance, suppose a member of the service delivery or operations team changed a configuration value beyond an "acceptable" threshold. What would the application do? Obviously, it depends on which setting was changed, why it was changed, and what the solution was using it for. If the value was being used to "throttle" incoming requests, and it was lowered, fewer requests would be passed in for processing. As a result, the end-to-end processing time should improve because the system is under less load, but you'd also see a decrease in throughput because the number of transactions entering the system has been constrained. Is this what's expected? If you're monitoring both end-to-end processing time and throughput, these fluctuations would be highlighted in the monitoring console. However, you still need to consider what the most appropriate throttle values, and the acceptable ranges for them, should be. For example, the throttle may have an appropriate ...
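
One way to make a throttle setting programmatically resilient to an out-of-range change is to check every proposed value against an acceptable range before applying it. The following Python code is a minimal sketch of that idea, not a definitive implementation. The names ThrottleRange, ConfigOutOfRangeError, validate_throttle, and apply_throttle are hypothetical, as are the bounds used in the usage note; appropriate values would need to be determined for the solution in question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThrottleRange:
    """Acceptable bounds for a throttle setting (hypothetical type)."""
    minimum: int
    maximum: int


class ConfigOutOfRangeError(ValueError):
    """Raised when a proposed configuration value is outside its range."""


def validate_throttle(proposed: int, allowed: ThrottleRange) -> int:
    """Return the proposed value if it is within range, otherwise raise."""
    if not allowed.minimum <= proposed <= allowed.maximum:
        raise ConfigOutOfRangeError(
            f"throttle {proposed} is outside "
            f"[{allowed.minimum}, {allowed.maximum}]"
        )
    return proposed


def apply_throttle(proposed: int, current: int, allowed: ThrottleRange) -> int:
    """Apply a change only if it is valid; otherwise keep the current value."""
    try:
        return validate_throttle(proposed, allowed)
    except ConfigOutOfRangeError as exc:
        # Assumption: a real system would log this and raise an
        # operational event rather than print to standard output.
        print(f"rejected configuration change: {exc}")
        return current
```

With an assumed range of ThrottleRange(minimum=10, maximum=500), calling apply_throttle(250, 100, allowed) accepts the new value, whereas apply_throttle(5000, 100, allowed) rejects it and keeps the current value of 100. Keeping the last known good value is only one possible policy; failing fast or falling back to a documented default are alternatives.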
