Types of Evidence and Their Strengths and Weaknesses

For the sake of illustration, let’s consider an example. Imagine that we are assessing a new software engineering technology, AWE (A Wonderful new Excitement), which has been developed to replace BURP (Boring but Usually Respected Predecessor). What sort of evidence might we consider in deciding whether to adopt AWE? In the following sections, we describe common types of studies and evaluate each for the credibility and relevance issues that typically arise.

Controlled Experiments and Quasi-Experiments

Controlled experiments are suitable if we want to perform a direct comparison of two or more conditions (such as using AWE versus using BURP) with respect to one or more criteria that can be measured reliably, such as the amount of time needed to complete a certain task. These experiments can often also be helpful when measurement is tricky, such as counting the number of defects in the work products produced with AWE or BURP. “Control” means keeping everything else (other than exchanging AWE for BURP) constant, which we can do straightforwardly with the work conditions and the task to be solved. For the gazillions of human variables involved, the only way to implement control is to use a group of subjects (rather than just one) and count on all the differences to average out across that group. This hope is justified (at least statistically) if we assign the subjects to the groups at random (a randomized experiment), which makes ...
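
To make the idea of random assignment concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the subject IDs, the function name assign_groups, the equal group sizes, and the fixed seed are all hypothetical and are not part of any particular study design.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into two groups, one per condition.

    Hypothetical helper: shuffles a copy of the subject list and cuts it
    in half, so neither the experimenter nor the subjects choose the group.
    """
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(subjects)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"AWE": shuffled[:half], "BURP": shuffled[half:]}


# Hypothetical subject identifiers, for illustration only.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = assign_groups(subjects, seed=42)

for condition, members in groups.items():
    print(condition, sorted(members))
```

Because chance alone decides who lands in which group, individual differences such as experience or skill have no systematic way of favoring AWE or BURP; they can still differ between the groups in any one draw, which is why larger groups help.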
