O'Reilly logo

Open Sources by Chris DiBona, Sam Ockman

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Open Source Software Engineering

An Open Source project can include every single one of the above elements, and to be fair, some have. The commercial versions of BSD, BIND, and Sendmail are all examples of the standard software engineering process—but they didn't start out that way. A full-blown software engineering process is very resource-hungry, and instantiating one usually requires investment, which usually requires some kind of revenue plan.

The far more common case of an open-source project is one where the people involved are having fun and want their work to be as widely used as possible so they give it away without fee and sometimes without restrictions on redistribution. These folks might not have access to so-called "commercial grade" software tools (like code coverage analyzers, bounds-checking interpreters, and memory integrity verifiers). And the primary things they seem to find fun are coding, packaging, and evangelizing—not QA, not MRDs, and usually not hard and fast ship dates.

Let's revisit each of the elements of the software engineering process and see what typically takes its place in an unfunded Open Source project—a labor of love.

Marketing Requirements

Open Source folks tend to build the tools they need or wish they had. Sometimes this happens in conjunction with one's day job, and often it's someone whose primary job is something like system administration rather than software engineering. If, after several iterations, a software system reaches critical mass and takes on a life of its own, it will be distributed via Internet tarballs and other users will start to either ask for features or just sit down and implement them and send them in.

The battleground for an open-source MRD is usually a mailing list or newsgroup, with the users and developers bantering back and forth directly. Consensus is whatever the developers remember or agree with. Failure to consense often enough results in "code splits," where other developers start releasing their own versions. The MRD equivalent for Open Source can be very nurturing but it has sharp edges—conflict resolution is sometimes not possible (or not attempted).

System-Level Design

There usually just is no system-level design for an unfunded Open Source effort. Either the system design is implicit, springing forth whole and complete straight from Zeus's forehead, or it evolves over time (like the software itself). Usually by Version 2 or 3 of an open-source system, there actually is a system design even if it doesn't get written down anywhere.

It is here, rather than in any other departure from the normal rules of the software engineering road, that Open Source earns its reputation for being a little bit flakey. You can compensate for a lack of a formal MRD or even formal QA by just having really good programmers (or really friendly users), but if there's no system design (even if it's only in someone's head), the project's quality will be self-limited.

Detailed Design

Another casualty of being unfunded and wanting to have fun is a detailed design. Some people do find DDDs fun to work on, but these people generally get all the fun they can stand by writing DDDs during their day jobs. Detailed design ends up being a side effect of the implementation. "I know I need a parser, so I'll write one." Documenting the API in the form of external symbols in header files or manpages is optional and may not occur if the API isn't intended to be published or used outside of the project.

This is a shame, since a lot of good and otherwise reusable code gets hidden this way. Even modules that are not reusable or tightly bound to the project where they are created, and whose APIs are not part of the feature deliverables, really ought to have manpages explaining what they do and how to call them. It's hugely helpful to the other people who want to enhance the code, since they have to start by reading and understanding it.


This is the fun part. Implementation is what programmers love most; it's what keeps them up late hacking when they could be sleeping. The opportunity to write code is the primary motivation for almost all open-source software development effort ever expended. If one focuses on this one aspect of software engineering to the exclusion of the others, there's a huge freedom of expression.

Open-source projects are how most programmers experiment with new styles, either styles of indentation or variable naming or "try to save memory" or "try to save CPU cycles" or what have you. And there are some artifacts of great beauty waiting in tarballs everywhere, where some programmer tried out a style for the first time and it worked.

An unfunded Open Source effort can have as much rigor and consistency as it wants—users will run the code if it's functional; most people don't care if the developer switched styles three times during the implementation process. The developers generally care, or they learn to care after a while. In this situation, Larry Wall's past comments about programming being an artistic expression very much hit home.

The main difference in an unfunded Open Source implementation is that review is informal. There's usually no mentor or peer looking at the code before it goes out. There are usually no unit tests, regression or otherwise.


Integration of an open-source project usually involves writing some manpages, making sure that it builds on every kind of system the developer has access to, cleaning up the Makefile to remove the random hair that creeps in during the implementation phase, writing a README, making a tarball, putting it up for anonymous FTP somewhere, and posting a note to some mailing list or newsgroup where interested users can find it.

Note that the comp.sources.unix newsgroup was rekindled in 1998 by Rob Braun, and it's a fine place to send announcements of new or updated open-source software packages. It also functions as a repository/archive.

That's right, no system-level testing. But then there's usually no system-level test plan and no unit tests. In fact, Open Source efforts are pretty light on testing overall. (Exceptions exist, such as Perl and PostgreSQL.) This lack of pre-release testing is not a weakness, though, as explained below.

Field Testing

Unfunded open-source software enjoys the best system-level testing in the industry, unless we include NASA's testing on space-bound robots in our comparison. The reason is simply that users tend to be much friendlier when they aren't being charged any money, and power users (often developers themselves) are much more helpful when they can read, and fix, the source code to something they're running.

The essence of field testing is its lack of rigor. What software engineering is looking for from its field testers is patterns of use which are inherently unpredictable at the time the system is being designed and built—in other words, real world experiences of real users. Unfunded open-source projects are simply unbeatable in this area.

An additional advantage enjoyed by open-source projects is the "peer review" of dozens or hundreds of other programmers looking for bugs by reading the source code rather than just by executing packaged executables. Some of the readers will be looking for security flaws and some of those found will not be reported (other than among other crackers), but this danger does not take away from the overall advantage of having uncounted strangers reading the source code. These strangers can really keep an Open Source developer on his or her toes in a way that no manager or mentor ever could.


"Oops, sorry!" is what's usually said when a user finds a bug, or "Oops, sorry, and thanks!" if they also send a patch. "Hey, it works for me" is how Open Source developers do bug triage. If this sounds chaotic, it is. The lack of support can keep some users from being willing (or able) to run unfunded Open Source programs, but it also creates opportunities for consultants or software distributors to sell support contracts and/or enhanced and/or commercial versions.

When the Unix vendor community first encountered a strong desire from their users to ship prepackaged open-source software with their base systems, their first reaction was pretty much "Well, OK, but we're not going to support it." The success of companies like Cygnus has prompted reexamination of that position, but the culture clash runs pretty deep. Traditional software houses, including Unix vendors, just cannot plan or budget for the cost of sales of a support business if there are unreviewed changes being contributed by uncounted strangers.

Sometimes the answer is to internalize the software, running it through the normal QA process including unit and system testing, code coverage analysis, and so on. This can involve a reverse-engineered MRD and DDD to give QA some kind of context (i.e., what functionality to test for). Other times the answer is to rewrite the terms of the support agreement to "best efforts" rather than "guaranteed results." Ultimately the software support market will be filled by who can get leverage from all those uncounted strangers, since a lot of them are good people writing good software, and the Open Source culture is more effective in most cases at generating the level of functionality that users actually want (witness Linux versus Windows).

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required