Before relying on a new experimental device, a good physicist will establish its accuracy. A new detector is always tested by recording its responses to known input signals. The results of this calibration are compared against the expected responses. If the device is trustworthy, then the observed responses will fall within acceptable bounds of what was expected. To make this a fair test, the accuracy bounds are set prior to the test. The same goes for testing in computational science and software development.
Code is assumed guilty until proven innocent. This applies to software written by other people, but even more so to software you have written yourself. The mechanism that builds trust that software is performing correctly is called testing.
Testing is the process by which the expected results of code are compared against the observed results of actually running that code. Tests are typically provided along with the code that they are testing. The collection of all of the tests for a given piece of code is known as the test suite. You can think of the test suite as a bunch of precanned experiments that anyone can run. If all of the tests pass, then the code is at least partially trustworthy. If any of the tests fail, then the code is known to be incorrect with respect to whichever cases failed.
Now, you may have noticed that the test code itself is part of the software package. Since the tests are just as likely to have bugs as the code they are testing, it is tempting ...