Chapter 1. Unit Test Frameworks: An Overview

Most people who write software have at least some experience with unit testing. If you have ever written a few lines of throwaway code just to try something out, you’ve built a unit test. On the other end of the software spectrum, many large-scale applications have huge batteries of test cases that are repeatedly run and added to throughout the development process. Unit tests are useful at all levels of programming.

What are unit test frameworks and how are they used? Simply stated, they are software tools to support writing and running unit tests, including a foundation on which to build tests and the functionality to execute the tests and report their results. They are not solely tools for testing; they can also be used as development tools on a par with preprocessors and debuggers. Unit test frameworks can contribute to almost every stage of software development, including software architecture and design, code implementation and debugging, performance optimization, and quality assurance.

Unit tests usually are developed concurrently with production code, but are not built into the final software product. The relationship of unit tests to production code is shown in Figure 1-1.

Figure 1-1. Production application and unit test framework

An application is built from software objects linked together. The unit tests use the application’s objects, but exist inside the unit test framework. This approach has a number of nice aspects. The production code is not cluttered up with built-in unit tests. The size of the compiled application tends to be kept smaller for the same reason. The tests can be run separately from the application, so the objects can be tested in isolation.

A single unit test should test a particular behavior within the production code. Its success or failure validates a single unit of code. Well-written tests set up an environment or scenario that is independent of any other conditions, then perform a distinct action and check a definite result. These tests should avoid dependencies on the results of other tests (called test coupling), and they should be short and simple. By starting with tests of the most basic functionality, then gradually building to tests of compound objects and behaviors, a unit test framework can be used to verify very complex architectures. Having such a test framework to build upon not only is much easier than developing standalone tests, but also produces more thorough, effective tests. A comprehensive suite of unit tests enables rapid application development, since the effects of every change can be immediately and thoroughly verified.
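
As a minimal sketch of such a test, the following uses Python's standard unittest module. The Account class and its deposit method are hypothetical stand-ins for production code, invented here for illustration. Each test gets its own fresh fixture from setUp, performs one action, and checks one result, so neither test depends on the outcome of the other.

```python
import unittest


class Account:
    """Hypothetical production class, used only for illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount


class AccountTest(unittest.TestCase):
    def setUp(self):
        # A fresh fixture is built before every test, which avoids test coupling.
        self.account = Account(balance=100)

    def test_deposit_increases_balance(self):
        self.account.deposit(50)
        self.assertEqual(self.account.balance, 150)

    def test_deposit_rejects_nonpositive_amount(self):
        with self.assertRaises(ValueError):
            self.account.deposit(0)


# Load and run the tests programmatically so the outcome is available as a value.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the fixture is rebuilt for each test, the two tests can run in any order, or one at a time, with the same outcome.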

In the traditional jargon of testing, tests are categorized as black box or white box, depending on the amount of access to the internal workings of whatever is being tested. Functional and structural tests are related ideas. For example, a test that simply runs a program and checks its return code is a black box (functional) test, since nothing is known about how the program is written. Unit tests are usually white box (structural) tests, since the test framework is able to access the internal structure of the code being tested. Most object-oriented languages provide access protection, preventing outside classes from accessing protected or private code elements. Because of this, unit tests often are written to test only the public interfaces of the objects tested. This encourages the design of objects with discrete, testable interfaces and a minimum of complex hidden behavior. Thus, writing testable objects promotes good object-oriented development practices.

Another distinction is drawn between programmer and acceptance tests. Developers write programmer tests as they design and build code. These usually test low-level code elements, such as methods and interfaces. Acceptance tests may be specified or written by a nonprogrammer, such as a quality-assurance person or product manager. These generally are functional tests of high-level behavior, such as producing output or performing a user task. Unit tests may fall into either of these categories.

Test Driven Development

Unit test frameworks are a key element of Test Driven Development (TDD), also known as “test-first programming.” TDD is one of the most significant and widely used practices in Extreme Programming (XP) and other Agile Development methodologies. Test frameworks achieve their maximum utility when used to enable TDD, although they still are useful when TDD is not followed. This book concentrates on unit test frameworks as a family of tools, rather than specifically on TDD, but the two topics are closely related.

The key rule of TDD can be summarized as “test twice, code once,” by analogy to the carpenter’s rule of “measure twice, cut once.” “Test twice, code once” refers to the three-step procedure involved in any code change:

  1. Write a test of the new code and see it fail.

  2. Write the new code, doing “the simplest thing that could possibly work.”

  3. See the test succeed, and refactor the code.

These three basic steps are the TDD cycle.

Step 1 is to write a test, run it, and verify the resulting failure. The failure is important because it validates that the test fails as expected. It is often tempting to skip running the test and seeing the failure. Don’t.

In Step 2, code is written to make the test succeed. A wise guideline is doing “the simplest thing that could possibly work.” This may be a completely trivial implementation, such as having the new code return a constant value or copying and pasting code from one place to another. It doesn’t have to be pretty; it just has to pass the test. The temptation in this step is to do a little extra work and make some additional code change not directly related to passing the test. Again, don’t do this.

In Step 3, the test succeeds, verifying both the new code and its test. At this point, the new code may be refactored. Refactoring is a software engineering concept defined as “behavior-preserving transformation.” More formally, refactoring is the process of transforming code to improve its internal design without changing its external functionality. Within the TDD cycle, refactoring starts with the inelegant code that was written to pass the unit test and improves it by removing duplication or other ugliness. Since the unit test is in place, the details of how the code is implemented can be altered with confidence.
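
The three steps can be sketched in Python as one test run against three successive versions of the same function. Everything here is an assumption made for illustration: the function names, the line-item format, and the passes helper are hypothetical. In real work there would be a single function edited in place rather than three copies.

```python
import io
import unittest


def passes(total_cents):
    """Run one unit test against the given implementation; True if it succeeds."""

    class TotalTest(unittest.TestCase):
        def test_total_of_two_line_items(self):
            # Each item is (quantity, unit price in cents).
            self.assertEqual(total_cents([(2, 300), (1, 400)]), 1000)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())  # discard the report
    return runner.run(suite).wasSuccessful()


# Step 1: the test exists but the code does not, so the test must fail.
def total_step1(items):
    raise NotImplementedError


# Step 2: the simplest thing that could possibly work.
def total_step2(items):
    return 1000


# Step 3: refactor, replacing the constant duplicated from the test
# with a computation, while the test keeps passing.
def total_step3(items):
    return sum(quantity * price for quantity, price in items)


outcomes = [passes(total_step1), passes(total_step2), passes(total_step3)]
print(outcomes)  # [False, True, True]
```

The move from the constant in Step 2 to the computation in Step 3 is treated here as removing duplication between test and code. In practice, a second test with different line items would often be written first to force the general implementation.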

New code should only be written when a test fails. Code changes are only expected to occur when you are refactoring, adding new functionality, or debugging. Continuously repeating the TDD cycle is the most atomic level of the software development process. Software changes generally fall under two categories: adding new functionality or fixing bugs.

When adding new functionality, the first step is always to write a unit test that anticipates and uses the new code. After the unit test runs and fails, add the new code and re-test to verify success. The unit test has value aside from simply demonstrating that the new functionality works. Writing the test forces you to think in advance about the ideal design of the new code. Thus, in a sneaky and subtle way, TDD makes all new development part of a methodical, low-level software design process. Once the new unit test and functionality are in place, the unit test serves as the definitive, working example of how the new code is supposed to be used. For these reasons, time spent writing unit tests is not solely testing effort. Investments in testing are equal investments in design.

When debugging, you should first write a unit test that fails because of the bug. This is a useful effort in itself, because it determines exactly how the bug occurs. Once the unit test is in place and failing, fix the bug and re-run the test to verify that the bug is closed. Aside from fixing the bug, this process has the additional benefit of creating a test that will catch it. If the bug is ever re-introduced, the test will fail and highlight the problem.
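
A sketch of this debugging workflow in Python follows. The word-counting functions and the bug itself are hypothetical examples. The buggy and fixed versions are shown side by side only so that the same regression test can be run against both.

```python
import io
import unittest


def count_words_buggy(text):
    # Bug: splitting on a single space yields empty strings for repeated spaces.
    return len(text.split(" "))


def count_words_fixed(text):
    # Fix: split() with no argument treats a run of whitespace as one separator.
    return len(text.split())


def regression_test_passes(count_words):
    """Run the regression test against one implementation; True if it succeeds."""

    class CountWordsRegressionTest(unittest.TestCase):
        def test_double_space_is_not_counted_as_a_word(self):
            self.assertEqual(count_words("unit  test"), 2)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountWordsRegressionTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())  # discard the report
    return runner.run(suite).wasSuccessful()


before_fix = regression_test_passes(count_words_buggy)
after_fix = regression_test_passes(count_words_fixed)
print(before_fix, after_fix)  # False True
```

The test stays in the suite after the fix. If a later change brings back the single-space split, the test fails again and points directly at the cause.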

By following the TDD cycle, you can come as close as humanly possible to writing flawless code on the first try; in other words, “code once.” The process gives you a clear indication that a piece of work is done. When a new unit test is written and then fails, the task is halfway completed. You cannot move on to something else until the test succeeds.

Unit Testing and Quality Assurance

Unit test frameworks are valuable when used for automated software testing as part of a quality assurance (QA) process. In many software development groups, the QA process starts when new code is submitted, built, and unit tested. Often, the unit tests include not only programmer tests, but also acceptance tests designed or written by the QA team. If all the unit tests succeed, the code is provisionally accepted and sent to a QA engineer for inspection and testing.

Running the full suite of unit tests as the first step in QA has many benefits. Most importantly, the tests ensure that the code is solid the moment it has left the developers’ hands. No human intervention is required to run the tests and evaluate the results. Either they all succeed, or there is a failure. Such Boolean (true/false) results are ideal because an automated system can understand them. The success of the unit tests confirms that the developers’ assumptions are valid, and that the low-level functionality is working correctly at a level of scrutiny that functional tests can never achieve. When numerous developers are making changes at once, the unit tests provide confidence that nobody’s changes caused someone else’s code to break. Furthermore, unit tests help to provide accountability. Knowing exactly which test fails usually makes it apparent whose change broke things. “Breaking the build” once meant submitting code that caused a compile to fail, but now often refers to causing a unit test failure as well. Many teams employ heinous punishments (such as making the responsible developer buy donuts or beer for coworkers) to remind everyone that breaking the build is a serious offense. The failure of a unit test clearly places a high priority on fixing the problem. If TDD is followed rigorously, the code should never be left in a state in which a unit test fails.
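
The Boolean nature of the result is what lets an automated system act on it. As a rough sketch in Python, a build step might reduce an entire test run to a single exit status. The gate function and the two checks are hypothetical and do not describe any particular build tool.

```python
import io
import unittest


def gate(checks):
    """Run every check as a unit test; return 0 if all pass, otherwise 1."""
    suite = unittest.TestSuite(unittest.FunctionTestCase(check) for check in checks)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return 0 if result.wasSuccessful() else 1


def addition_is_correct():
    assert 2 + 2 == 4


def deliberately_broken():
    # Stands in for a change that breaks someone else's code.
    assert 2 + 2 == 5, "simulated regression"


clean_status = gate([addition_is_correct])
broken_status = gate([addition_is_correct, deliberately_broken])
print(clean_status, broken_status)  # 0 1
```

A build script could pass the returned status to sys.exit, so that any single failing test marks the whole build as broken.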

Unit testing doesn’t replace all other types of testing. It is entirely possible to develop thoroughly unit-tested, completely bulletproof code that is lacking in usability and performance. Stress testing, performance testing, and usability testing usually are separate considerations from unit testing. QA effort is still necessary to try out the completed application, decide whether it performs acceptably in real-world conditions, observe how things work outside of a controlled development environment, and otherwise apply human judgment. There are elements of software functionality for which it is difficult or impossible to write good unit tests. These include GUI “look and feel,” responses to system events, interaction with distributed application components, and many other possibilities. Sometimes unit tests can be written to simulate these types of situations, but ultimately, there is no substitute for reality or for a user’s objective feedback.

Although manual QA testing is still important, unit tests are a powerful tool for QA. Developers who use test-centric development report dramatic improvements in software quality, speed of development, and ability to make significant design changes on the fly. These speed and quality advantages rapidly become apparent from the QA perspective as well.

Homegrown Unit Testing

Writing simple tests comes naturally to most programmers. The classic beginner exercise of writing a three-line program that prints “Hello world!” is a basic unit test of the development language and environment. Find a software shop with no unit test framework in place (if such a prehistoric place could possibly exist), and you may see developers writing their own little “toy programs” or “test utilities” to try out new code. The sad thing about this approach is that the toy programs are thrown away once the developer is done with them. Later, when something breaks, someone has to laboriously debug the production code, without benefit of the developer’s test.

Another common low-level testing technique is to build tests into the production code with ASSERT macros. In debug builds, the macro tests a condition and sends a message if it fails. In release builds, the macro is defined to be empty, so no test code is included. This allows a developer to sprinkle assertions throughout the code, reporting any condition that is worthy of someone’s attention. Asserts can be a useful thing to have in your software toolbox, but far less so than true unit testing. For an assert to be evaluated, the production code must be run to the point where it is defined. It’s not convenient for automated testing, since an automated system doesn’t know how to cause a particular assert to fire. Failures don’t leave the developer with a clear path to correct the problem. Fixing a failure is no guarantee that the same problem will not happen again under different circumstances. Reliance on testing with this type of assert is unlikely to produce high-quality software. It is a forerunner to formal unit testing, which uses test asserts contained within well-defined tests, rather than placed randomly in the production code.
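
The contrast can be sketched in Python, whose assert statement behaves much like the macros described above: it is checked in normal runs and removed when the interpreter is started with the -O option. The apply_discount function is a hypothetical piece of production code.

```python
import unittest


def apply_discount(price_cents, percent):
    # Built-in assertion: checked only when execution actually reaches this
    # line, and removed entirely when Python runs with -O.
    assert 0 <= percent <= 100, "percent out of range"
    return price_cents * (100 - percent) // 100


class ApplyDiscountTest(unittest.TestCase):
    # Test asserts live in well-defined tests. Each test drives the production
    # code to a known state on purpose, so it can be run automatically.
    def test_quarter_off(self):
        self.assertEqual(apply_discount(200, 25), 150)

    def test_full_discount_is_free(self):
        self.assertEqual(apply_discount(200, 100), 0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The built-in assertion fires only if some caller happens to pass a bad percentage, while the tests exercise chosen cases every time the suite runs.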

Just as many developers take the initiative and write test programs to try out small pieces of code, it’s common to find developers putting together basic, home-grown unit test frameworks that take care of their testing needs. As demonstrated in Chapter 2, a test framework can be just a few lines of code to run unit tests and report the results. Even a very simple framework can be the foundation for thorough testing of complex applications.
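
To give a sense of scale, here is one possible home-grown framework in Python. It is not the framework developed in Chapter 2, only an illustrative sketch, and the names run_tests and check_equal are invented for it. A test is any function that raises AssertionError on failure.

```python
def check_equal(expected, actual):
    """Test assert: raise AssertionError with a readable message on mismatch."""
    if expected != actual:
        raise AssertionError(f"expected {expected!r}, got {actual!r}")


def run_tests(tests):
    """Run each test function and report results; return (passed, failed) names."""
    passed, failed = [], []
    for test in tests:
        try:
            test()
        except AssertionError as exc:
            failed.append(test.__name__)
            print(f"FAIL {test.__name__}: {exc}")
        else:
            passed.append(test.__name__)
            print(f"ok   {test.__name__}")
    print(f"{len(passed)} passed, {len(failed)} failed")
    return passed, failed


def addition_works():
    check_equal(4, 2 + 2)


def upper_works():
    check_equal("ABC", "abc".upper())


def deliberately_broken():
    check_equal(5, 2 + 2)  # fails on purpose to show the failure report


passed, failed = run_tests([addition_works, upper_works, deliberately_broken])
```

This sketch treats only AssertionError as a test failure. Any other exception stops the run, which a fuller framework would catch and report as an error.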
