Chapter 1. The Nature of Open Source

This book is a long and thorough answer to the question, is open source right for you? The intended audience is the typical Information Technology (IT) department that is charged with supporting a business with appropriate application of technology. This book is written from an IT department’s perspective and is organized around the common problems that face those who struggle in the trenches. The goal of Open Source for the Enterprise is to help technology and business executives determine whether they can benefit from using open source in their environments.

Open source began as free software built by thousands of volunteers who shared the results of their work without charging any fees. Billions of dollars of value has been created based on this simple structure. The adoption of open source software has become a cultural phenomenon. The basic facts regarding the growth of the open source movement are amazing.

Open source success stories are well known and more arise every week. For instance, the city of Munich chose OpenOffice.org, an open source suite of desktop applications, very publicly sticking a finger in the eye of Microsoft, which aggressively sought the contract. Amazon.com dumped Sun hardware and software in favor of Linux, the most popular open source operating system available. Apache, an open source web server architecture, is the most popular web server in the world. Perl, a robust scripting language, is used to run huge, highly scalable sites such as Ticketmaster. Large financial companies are creating massive clusters of Linux machines for crunching numbers in complex portfolio analysis. This is just the tip of the iceberg. The examples of corporate success for open source would fill a phone book.

Internationally, open source is being adopted by entire governments. Smaller communities are using it to create versions for their specific languages. China, Brazil, Thailand, and Peru are all adopting open source software officially and are spending millions to improve the software and encourage its adoption.

All of this success has changed the nature of open source. No longer can one assume that the typical open source project comprises a small band of programmers toiling away in obscurity. Major technology vendors got open source religion and made broad and long-term commitments to open source software. IBM released as open source its Eclipse platform for creating development tools, a project on which it spent $40 million. IBM has become the largest corporate proponent of Linux, and it spends hundreds of millions of dollars to support and market that platform. Hewlett-Packard uses open source in all sorts of ways, from supporting development of useful projects to releasing device drivers into the marketplace. Novell has purchased several major open source-related companies, and is creating a large and integrated collection of open source applications for enterprise use. Nearly every important enterprise-grade software product has support for Linux. Even commercial web servers based on Apache are available, including Hewlett-Packard’s Secure Web Server for OpenVMS.

Companies large and small have taken to open source as a way to increase collaboration, reduce development costs, provide a friendly platform for their products, and sell services.

For an IT department, the stakes can be high. Becoming the sort of IT department that can successfully use open source means empowerment, saving hard dollars and ensuring freedom from captivity to vendors. Other significant benefits include:

  • Saving money on license fees

  • Reducing support costs

  • Reducing integration costs

  • Avoiding vendor lock-in and gaining power in negotiations

  • Gaining access to the functionality of thousands of programs

  • Improving the value of IT to your business

But gaining these benefits comes with responsibilities. Installing open source does not mean all your problems are solved. To use open source and support it in a commercial environment, IT departments must learn to:

  • Develop and maintain skills required to install and configure open source

  • Increase their software development skills

  • Become experts in evaluating the maturity of open source

  • Improve their understanding of the technology requirements of the business

  • Understand and manage open source licensing issues, especially if their company distributes software applications

The Open Source Debate

One way of looking at this book is as a tour of the benefits and responsibilities of using open source. The opportunity provided by open source is too large to ignore for any organization that seeks to support its operations with software.

The scope of open source has grown beyond basic development tools to become a top-to-bottom infrastructure for computing of all stripes, including development environments, databases, operating systems, web servers, application servers, and utilities for all types of data center management. Open source now encompasses a huge variety of end-user applications, such enterprise applications as Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM), tools such as portals and data warehouses, and integration tools for messaging as well as for web services. All of these can be the foundation of the sort of automation and productivity gains that can lead to a company’s competitive advantage.

But in most organizations, discussing open source brings up strong opinions on all sides that obscure pragmatic analysis of the key question: can you use open source profitably at your organization?

There is no simple answer to this question. People on both sides have good points to make and are also protecting their own interests. At its worst, the debate becomes a cartoonish farce.

Programmers, systems administrators, and other technologists who are fascinated by various open source programs might tout the fact that the software is free. While this is true, managers sometimes suspect a hidden agenda of seeking more cool toys to play with, without adequate consideration of the other costs that are incurred when using any piece of software, including the costs of evaluation, testing, installation, configuration, customization, upgrades, operations, and support.

Managers frequently take the opposite position, that open source is not worth considering because it can lack features of commercial software such as support and maintenance services, installation scripts, and documentation. For good reasons, managers like the idea of one throat to choke if something goes wrong. It is a remedy for the finger pointing that characterizes all commercial technology support in multivendor installations. But hiding behind this objection ignores the fact that technologists at tens of thousands of companies have proved that the risks and responsibilities of using open source are manageable.

One ideal that is rarely achieved is to merge the creativity and technical brilliance of the open source world with the operational discipline and process of IT. But the two sides look at each other with disdain. The open source experts look at IT and see a massive skills gap: what is so hard about picking up and maintaining the skills needed to use open source? The IT professionals look at open source software and frequently see a productization gap because of half-finished products: what is so hard about finishing all the administrative interfaces, configuration tools, Application Programming Interfaces (APIs), and documentation to make the software useful?

Both the skills and the productization gaps represent real challenges to wider adoption of open source. Organizations that can learn to overcome the skills and productization gaps and put open source to work will have an edge in terms of cost and flexibility over those that cannot.

Fortunately, the debate has moved out of the cartoonish phase, and many organizations are now taking up the real job of analyzing what kind of company they want to be, what their long-term needs are, and whether open source can play some sort of role.

The prudent course is to choose carefully when to use open source, based on a thorough understanding of what is involved. This book won’t answer every question for every different type of project. But it will show you how to evaluate open source software for common scenarios, and it will teach you how to get the answers to commonly asked questions and communicate them to others.

Understanding Your Open Source Readiness

Whether open source will work at any company depends on both the capabilities of the company and the maturity of the open source software. The fact is that some open source is so rickety that it isn’t useful to anyone except the most highly skilled. If you browse the popular directories of open source software, it doesn’t take long to find dormant projects that have not been updated for years. For example, Cheetah, a Python-powered template engine, was posted on the http://www.freshmeat.net site on July 15, 2001, was updated July 16, 2001, and hasn’t been touched since. Other open source software, such as the Apache Web Server, is at the other end of the quality spectrum; it is widely considered better in every way than all the commercial alternatives, and it is as easy to use. Thousands of projects occupy the space in between these two extremes.
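One simple signal of dormancy is the time elapsed since a project's last update. The following is a minimal sketch of that check, not a complete maturity test. The function names, the two-year threshold, and the review date are assumptions made for illustration; only the Cheetah update date comes from the directory listing cited above.

```python
from datetime import date


def days_since_update(last_updated, as_of):
    """Return the number of days between the last update and the review date."""
    return (as_of - last_updated).days


def is_dormant(last_updated, as_of, threshold_days=730):
    """Flag a project as dormant if it has gone longer than the threshold without an update.

    The default threshold of roughly two years is an arbitrary assumption;
    choose a value that matches your own tolerance for risk.
    """
    return days_since_update(last_updated, as_of) > threshold_days


# Last update date for Cheetah as listed in the directory cited in the text.
cheetah_last_update = date(2001, 7, 16)

# Hypothetical date on which the evaluation is performed.
review_date = date(2005, 1, 1)

if is_dormant(cheetah_last_update, review_date):
    print("No recent updates: investigate before relying on this project.")
```

An old last-update date is only a prompt for further investigation. A small, finished utility may be stable rather than abandoned, so this check works best alongside other evidence, such as mailing list activity.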

Not all users of open source are equal. Given the IT budget of Amazon, Google, Yahoo!, or Ticketmaster and the pedigree of their engineering staff, it is no wonder that they can make open source work. They could write their systems in assembly language. But when you look far from the gurus of Silicon Valley and focus instead on the city of Houston, or on the Ernie Ball Company, a guitar-string maker that’s run entirely on open source software, you must realize that there is some middle ground. You don’t have to be an MIT Ph.D. to make open source work for your business or organization.

Getting It Right

The difference between the successful open source implementation, in which the value of open source is realized for a company, and the unsuccessful one, in which the struggle to use open source is not worth the effort, amounts to knowing your problem, knowing the software, and knowing yourself.

The key to a successful outcome in applying open source is a thorough understanding of answers to the following questions:

  • What problem are you trying to solve?

  • How would open source software help in providing the solution?

  • Does any open source software provide all or part of the solution?

  • How can the maturity and stability of relevant open source software be determined?

  • What skills are required to install, configure, customize, integrate, operate, and maintain the open source software?

  • Does your organization have the needed skills? If not, how can they be acquired and institutionalized?

  • In which cases does the value provided by the open source software exceed the cost of using and maintaining it, compared with other solutions?

This book approaches these questions in terms of skills, risks, and fully loaded costs. An IT department that intends to adopt open source must have not only the resources to do so, but also a belief in skills building and an inclination to take increased responsibility for its IT infrastructure. In Chapters 1 through 5 of this book, we analyze the nature of open source and describe three different models that can help companies evaluate the vast world of open source in a manner that is consistent and enables them to understand their own capabilities.

The models presented are:

Open Source Maturity model

A set of questions that help determine the stability and maturity of an open source project, the responsibilities involved in using a particular piece of open source, and the skills needed to manage those risks

Open Source Skills and Risk Tolerance model

A set of questions that help determine the ability of an organization to handle various risks and the tolerance of risk for a specific project

Software Cost and Risk model

A set of questions that help determine the total costs and risks of using open source as a solution for a project

With the information collected in using these models, half of the problem—knowing what you are getting yourself into—is solved. An IT department will be able to avoid choosing open source projects that are immature or ill-suited to its skills. The other half of the problem is finding the right open source project for a particular task, which can be vexing.
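To make the idea of fully loaded cost concrete, here is a small sketch that totals the cost categories named earlier in this chapter (license fees plus evaluation, testing, installation, configuration, customization, upgrades, operations, and support) and compares two options. It is not the Software Cost and Risk model itself, which is a set of questions; the function names and every dollar figure are hypothetical and exist only to show the arithmetic.

```python
# Cost categories named earlier in the chapter.
COST_CATEGORIES = (
    "license", "evaluation", "testing", "installation", "configuration",
    "customization", "upgrades", "operations", "support",
)


def fully_loaded_cost(costs):
    """Sum every cost category, treating any category left out as zero."""
    unknown = set(costs) - set(COST_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown cost categories: {sorted(unknown)}")
    return sum(costs.get(category, 0) for category in COST_CATEGORIES)


def cheaper_option(options):
    """Return the name of the option with the lowest fully loaded cost."""
    return min(options, key=lambda name: fully_loaded_cost(options[name]))


# Hypothetical multi-year figures in dollars, for illustration only.
options = {
    "open_source": {"license": 0, "evaluation": 15000, "installation": 10000,
                    "customization": 30000, "operations": 40000, "support": 25000},
    "commercial": {"license": 90000, "evaluation": 5000, "installation": 5000,
                   "customization": 10000, "operations": 30000, "support": 20000},
}

for name, costs in options.items():
    print(name, fully_loaded_cost(costs))
print("lower fully loaded cost:", cheaper_option(options))
```

The point of the sketch is that a license fee of zero is only one line in the total. With different figures for customization or support, the comparison can easily go the other way, and cost is only half of the picture because the model also weighs risk.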

Finding and Evaluating Open Source

Finding open source that you can relate to your needs is all too easy. Go to http://www.sourceforge.net/ and you will find an uncharted jungle of more than 70,000 projects. At http://freshmeat.net/ more than 30,000 projects are listed. Finding the right open source for your needs and evaluating its maturity can be exhausting and time consuming. Most open source projects are useless to organizations and businesses focused on solving problems, but a small number are incredibly valuable.

Also available are more organized and higher-quality sources of open source software, such as the Apache Software Foundation and Tigris.org (supported by CollabNet). Although these sources offer a significantly smaller set of open source applications, they are relatively mature, stable, and useful.

And don’t be fooled into thinking that using open source requires you to master Linux. Plenty of open source programs work perfectly well on the Microsoft platform, including Apache, MySQL, and Perl.

Once you find an open source program that might fit your needs, a host of questions arise: how do you know how stable it is? How can you find out if someone will be around to help if you have a problem? How can you find others who are using it? None of these questions has a simple answer.

The bottom line is that if you set out to use open source, you must learn to evaluate the software’s maturity and the level of support provided by the community that surrounds the project so that you can understand the risks. That is what the Open Source Maturity model is all about.

The Nature of Open Source

The way that open source comes to life, evolves, and finds its way to new groups of users is a profoundly democratic, decentralized, and somewhat chaotic process. For commercial software, investors demand a plan reflecting what the software will do and who will buy it. Vendors pay for sales staff, marketing and advertising departments, and conferences and events to let potential buyers know about their products. The trade press offers a steady stream of product reviews. Analysts write reports on new types of products.

Open source is a grass-roots effort. Open source developers create code to meet their own needs, and throw it up on the Internet so that others can interact with it and make it better. Nobody buys you lunch. Nobody is going to call you on the phone and suggest that using open source is a good idea. In most cases, you will have to find out about open source software yourself. It will not come to you. This is slightly less true now than it used to be (see the upcoming sidebar, “Open Source Sales and Marketing”). IBM will call you about Linux, but the conversation will quickly get to hardware and services. Newly formed support companies are also encouraging use of open source, but none of this changes the fact that open source means taking responsibility.

The way that open source grows is an amazing demonstration of community evolution. It turns out that communities are not interested in documentation until late in the cycle, and even then the documentation does not tell you what you need to know about the project’s health and how well the software works.

So, when you go looking for open source to fill your needs, it can be difficult to understand what is happening with a particular project. For the most popular and widely used projects, a lot of information is available, including books, magazines, conferences, and even consultants offering services. But leaving the most popular products aside, there are many sources of raw data but a dearth of useful information.

If finding and evaluating open source is this difficult, one might ask, why bother? The reason is that open source has grown to such an extent that huge opportunities are waiting for IT departments. Many of the newly formed open source support companies are focused on drawing IT’s attention to these opportunities.

A clear model of how open source software comes to life and grows is crucial for the IT community to understand what they will find when they go looking for open source. This chapter explains the life cycle of open source: how open source is created, how it evolves, what you will find when you look at an open source project, and how to make sense of this evidence. Being able to evaluate open source is particularly important for projects outside the realm of usual suspects, where much value lies waiting.

To explore the nature of open source, we will present several definitions of open source, followed by a review of the life cycle most open source projects go through. Then we will compare the life cycle and end product of commercial software with those of open source software.

What Is Open Source?

We’ve been sharing software since computers were invented. Significant portions of the early IBM operating systems, such as HASP (a print spooler), were developed in the field by users sharing and improving the software. IBM happily accepted that informally shared software, called it “field development,” and then included it in the operating system that helped run the huge mainframes that were the company’s vehicle for making money.

Today, this would not even be considered open source by its strictest definition. Many believe that software can be defined as open source only if it meets the 10 criteria in the “Open Source Definition” that is published and maintained by the Open Source Initiative (OSI; http://www.opensource.org/).

In the academic and scientific community, sharing software has always been a routine part of research and teaching activities. Many books tell the story of Arpanet, and how the Internet was developed and improved by sharing code over the network, with hardly a thought about licensing.

When PCs proliferated, starting in the early 1980s, a thriving exchange of software developed. This evolved into freeware, software that was available for use at no charge, and shareware, software that was available to try, but with the proviso that if you used it regularly, you should send in a small licensing fee. Extremely popular programs such as PKZIP, a file compression program created by Phil Katz, grew at amazing rates under this model.

Here we get to the key difference between open source and all other forms of software sharing. Open source is not just about giving away useful tools. It is about sharing source code and keeping it sharable. Remember that in open source, unlike in shareware or freeware, all of the source code used to build an application is shared, not just the executable version that allows you to run the program but not see how it works or be able to improve it. At its core, open source is about a cycle of innovation in which those who have the skills share ideas and build on each other’s work.

Richard Stallman originally defined free software as software that protected Four Freedoms for its users:

  • The freedom to run the program, for any purpose (freedom 0).

  • The freedom to study how the program works and adapt it to your needs (freedom 1). Access to the source code is a precondition for this.

  • The freedom to redistribute copies so that you can help your neighbor (freedom 2).

  • The freedom to improve the program and release your improvements to the public, so that the whole community benefits (freedom 3). Access to the source code is a precondition for this.

It takes a lot of work to create software, and programmers, while eager to share, are not generally eager to share and then have someone else decide to take their work and sell it. One key innovation that contributed significantly to the growth of open source software was the development of software licenses that prevented corporations from simply taking open source software and embedding it into their products. These open source licenses ensure that software developers can control the terms under which others can reuse the software they contribute to open source projects. Richard Stallman published the first open source license, known as the GNU General Public License (GPL). The GPL is a software license in which Stallman specified licensing terms that he believed embodied the spirit of his Four Freedoms. Linux, among many other open source applications, is distributed under the GNU GPL.

The innovation of open source is the creation of the legal structures that were first used to define a way to share software, and to keep contributions to it shareable as well. This avoids what is known as the Free Rider Problem, whereby freely shared work is appropriated for commercial gain.

So, the first definition of open source has to do with licensing:

Open source software is distributed under a type of license that promotes sharing by preserving public availability of source code and preventing restrictions on the software’s use and distribution.

Literally hundreds of licenses are now considered to be open source in some form or another. The OSI, as mentioned earlier, approves licenses as “open source” based on conformance to a set of criteria known as the Open Source Definition (OSD). Currently, more than 50 licenses are approved as open source by the OSI.

The Free Software Foundation, founded in 1985 to promote the development of the GNU operating system and free software in general, administers the GNU GPL, the best-known example of a licensing approach called Copyleft:

Copyleft is a license that permits people to freely copy, modify, and redistribute software so long as they do not keep others from also having the right to freely copy, modify, and redistribute the software. Copyleft provisions in a license require that anyone modifying the software can distribute only their modified versions under the terms of the open source license they originally received with the software. You can actually sell Copyleft software (haven’t you seen box sets of Linux at the computer store?), but you must also offer the source for free, either with the product or available for free (or for the cost of copying/shipping) to anyone on request. Not all open source licenses contain Copyleft provisions (from http://c2.com/cgi/wiki?CopyLeft).

The zeal with which some people have taken up the cause of open source has created another definition of open source:

Open source is a social and political movement that promotes the idea that all software should be made available under terms that embody the Four Freedoms initially defined by Richard Stallman.

Richard Stallman is famous for saying that the word free in Free Software Foundation is meant to be like free speech, not like free beer. But one can make excellent use of open source without joining this movement and adopting its attitude toward information and private property.

The term open source itself is an attempt to remove the emphasis from free (as in “no cost”) and also to draw a distinction between Stallman’s orthodoxy and those who were less strident politically but wanted to promote the idea of collaborative development with guaranteed access to source code. The term was coined in January 1998 by Christine Peterson, then president of the Foresight Institute in Santa Clara, California, in the wake of Netscape’s announcement that it would publish the source code of its browser. Eric Raymond’s essay, “The Cathedral and the Bazaar,” which later became a book (see upcoming The Cathedral and the Bazaar), about the power of collaborative, community-based development, was mentioned by Netscape as having influenced its decision.

For the user of open source in the enterprise, it is important to understand how open source evolves, sometimes in a lurching manner, through a collaborative process that is long on communication and short on planning.

As the number of open source projects has grown, and higher-quality software projects have emerged and have had a significant impact on the market, engineers have noticed the benefits of the loosely structured way in which open source software is created. This leads to the third definition of open source:

Open source is a community-based, iterative, incremental, and evolutionary software development methodology that emphasizes experimentation and experience over planning and formal design.

As we will discuss later in this chapter, little centralized planning governs the development of open source software, yet the result is frequently profoundly better than approaches that emphasize up-front design. In this sense, open source can be thought of as evidence in favor of some of the principles behind so-called “Agile” development methodologies such as eXtreme Programming, which emphasizes rapid iterations. For the purposes of this book, understanding the nature of how open source development takes place is key to making effective use of open source software.

Companies such as VA Software and CollabNet have sprung up to provide this open source development methodology to enterprises, packaged as a set of services.

The final definition of open source is one that arrived only after open source became a successful way of forming communities and creating a safe environment for cooperation:

Open source is a collaboration and marketing technique that can bring people and organizations together.

The Python-based Zope application server, which was initially commercial software but found more success as an open source project, is one example of this trend. MySQL, a popular database program, is marketed as open source and as commercial software simultaneously. IBM’s Eclipse project is perhaps the most prominent example of this aspect of open source. After pouring more than $40 million into the development of the Eclipse framework for creating software development tools, IBM decided to convert the project to open source. This meant that anyone could copy the source code it had cost IBM so many millions of dollars to create. Why would IBM do such a thing?

Here is one analysis that explains IBM’s behavior: IBM needs a development environment to support its Java-based development platform. That’s why the Eclipse project was started in the first place. But IBM realized that it was never going to make any money by selling development tools. IBM also saw that it would benefit greatly if other companies used its development tools and joined the development of Eclipse. There were three reasons for this. First, the more companies that signed on, the more credibility Eclipse would have. Second, having outside companies work on improving Eclipse would lower the cost of development. Finally, making Eclipse open source had a devastating effect on everyone else, including IBM’s competitors, who was trying to sell the same tools.

IBM explains the decision as an attempt to build trust in the Eclipse project among vendors, partners, and users, improve code quality, and encourage innovation. (Visit http://www.ibm.com/developerworks/linux/library/l-erick.html for details.)

As it turns out, releasing Eclipse as an open source project achieved all the goals mentioned earlier. The Eclipse Foundation (http://www.eclipse.org/) now governs Eclipse development, many companies have joined the project to help with development, and the platform is rapidly becoming one of the most popular integrated development environments, excelling not only at Java (which it was originally created for) but also across a diverse range of languages—an awesome example of the power of open source collaboration at work.

These definitions help describe the context in which open source exists. Now, let’s examine the way in which open source projects get started and how they grow.

Where Does Open Source Come From?

Most of the time, open source is born out of a need that leads to inspiration. Somebody somewhere who is frustrated, bored, or in some other state of creative readiness starts with an initial thought that begins with one of these phrases: “Wouldn’t it be cool if” or “I am sick of having to put up with...” or “I bet a lot of people would like...”. The end of these statements is a description of some sort of software. In open source parlance, this is called scratching a developer’s personal itch.

Linus Torvalds thought it would be cool to have a full Unix implementation that ran on the Intel chip set, and he created Linux. Larry Wall was interested in a language to help him with system programming tasks, and he created the Perl language. A group of people building their own web sites were frustrated with the NCSA web server and started sharing patches to it; these patches became the Apache web server (a-patch-y server; get it?).

The key thing to remember is that at first the designers and builders of open source applications were the primary users as well. This is the first principle of open source:

Open source software is most frequently built by programmers for other programmers.

So, what follows inspiration? Well, hard work, of course. The inspired developer now sets to work, creating the masterpiece that will solve the problem at hand. There is no formal requirements-gathering process. There is no market research. There is nothing but a smart, driven person thinking about what he wants to do and then setting about doing it.

After toiling alone at a keyboard, the inspired developer creates something that he is proud of and then shares it with others. This is the real birth of an open source project. If other programmers are captivated by the way the software meets a need they also have, they will join the project as either users or developers of the software. If such a community forms, the open source project is on its way to faster development and wider recognition. The second principle of open source, then, is as follows:

Open source projects are communities of developers and users organized around software that meets a common need.

Where does open source come from? It comes from inspiration about a solution to a problem that is compelling and common enough to attract other people to join a community and work on the project for free. In addition, many companies allow developers to work on open source as a part of their jobs if the project is important to that company.

The implication is that open source projects usually form around needs that programmers have. First-generation open source programs are almost all focused on programmers’ needs and ways of working.

So, if not money, what are the rewards of a successful open source project, besides scratching that itch? One major reward is status and peer recognition. Doing good and helping others is another factor that keeps many a programmer working late into the night. There is a strong ethic of community service in many developers, and a great sense of satisfaction is frequently derived from the knowledge that thousands of lives have been improved as a result of an open source project.

Developers are also motivated by rational self-interest. Aside from improving skills in general, thus becoming increasingly marketable, the developer becomes part of a great team. By releasing code as open source, a programmer can get significant help improving his software if a community forms around it. Bug fixes and enhancements in areas outside of a developer’s area of interest are benefits of a successful open source project. Even Microsoft, which is not at all friendly to open source, understands the value of community involvement and released millions of lines of code for inspection under its Shared Source Initiative. (It is important to note that when Microsoft says Shared Source it does not mean open source.)

As the use of open source becomes more widespread, and more tools are developed, open source projects grow larger as well, resulting in new uses and motivations that we describe in the section Second-Generation Trends in Open Source, later in this chapter. More projects are focusing on meeting needs outside of the programming community, but even those projects, such as a server that can stream MP3 files, are usually close to the hearts of developers. Where is the open source software for creating knitting and crochet patterns? It hasn’t yet caught the imagination of a developer.

How Does Open Source Grow?

So, there the inspired developer sits, with a handful of competent developers contributing to the project, all of them working away to make the software better. How does this work? The inspired developer is now the acknowledged leader of the project, but the position doesn’t come with much authority. Frequently, no legal agreements of any kind define the relationships in the community, except for the open source software license used to declare the software’s terms of use. Usually there is a shared source code repository, perhaps a web site that is used to organize the work of the project, and an email list that is used for communication. Any rules or structure are informal and are a matter of community acceptance and voluntary compliance. Very few projects have stated these rules in writing. The Apache Software Foundation’s community process is a rare example of a formal process, but even this process must be voluntarily accepted.

Because of this loose structure, an open source project is usually more like a high school rock band in a garage than the orderly and planned engineering process used in designing complex products such as automobiles. Rock bands break up and re-form quite often before (and after) they become successful. Open source projects are the same.

As a result, an informal community culture forms. Generally, the project leader—who is usually the inspired developer, but sometimes is someone else who is more suited to the task—starts setting the agenda and making a few rules. For example, is testing important? Is backward compatibility important? Are users welcome to participate or are they an annoyance? Are decisions made by a group vote or by one person?

The structure of open source communities can be all over the map. Some have a project leader and others have a community of developers. For example, the Apache Software Foundation is a meritocracy. There are no project leaders per se, but natural leaders emerge as they gain respect from peers working on the project.

One measure of a programmer’s status in the community is the level of access he is given to the source code repository. Some developers must submit source code to the group for approval. But others are allowed to make changes to the source code on their own. This is known as being a committer, and it usually carries a high degree of status. Any programmer who is a committer on the Apache Web Server project is hot stuff among his peers.

The focus of all this community activity is on improving the software. Is there a plan or a roadmap for how the software will evolve? In most cases, the answer is no. Even in mature products used by millions of people, such as Apache, there is no written roadmap explaining what functionality will be developed in the current version and what will be added in later versions. There are just programmers writing code to make the software better.

Open source software, then, can be thought of as evolving, pulled along by the vision of the project leader, the core group of developers, and feedback from users. If the need is focused, well defined, and well understood, the software usually reflects that. If the need is unclear and vague, the software reflects that, too. This leads us to the third principle of open source:

Open source software is not planned, but evolves according to the changing values and goals of the community.

For IT developers and managers, this point is significant. It means that to understand how an open source project is likely to grow, one must first understand the shared values of the community.

It is not uncommon for an open source community to change after the initial needs are met. The pace of change and the addition of new features might slow down dramatically once the project leader and developer community have achieved their goal. At that point, one project leader might step down and new leadership might emerge to take the project in a different direction.

How Does Open Source Die?

In some respects, most open source projects never really die. Even if the original developers abandon the project, the source code usually remains available so that it’s possible for someone interested in the code to pick it up at a later date and bring it back to life. This lowered risk of losing access to an application, in fact, is one of the advantages that comes from using open source. If a proprietary company builds a product and then goes out of business, the source code might be gone forever; with open source, there’s always an opportunity for a user to pick up development himself (or fund some other party to continue development).

That being said, many open source projects do “die” in the sense that they become dormant or are abandoned by their developers. The first way they die is that a community never forms around the need. In the early days, such deaths were invisible. But now, thanks to the rise of public web sites such as SourceForge (http://sourceforge.net/index.php), where open source projects can be started and shared, many stillborn projects are there for all to see. For example, not much has been happening with the Skydiver’s Online Logbook project (http://skydivelogbook.sourceforge.net/).

The second way that open source projects die is that they never reach completion. The inspired developer creates some software that partially meets the original need, and then for one of many potential reasons he loses interest in the project. The inspired developer gets married, has a baby, gets a new job, has a big project at work, gets bored, starts learning guitar, whatever. For some reason, he is no longer compelled to work on the project, so there it sits, in a half-completed state. Sometimes, at some point another developer picks up where the inspired developer left off, but usually the project just languishes and the source code collects digital dust.

One common reason that open source projects die is that the community behind the project has a schism and some of the developers copy the source code to a new repository and start doing things their own way. This is known as forking the project. This story is told over and over in the open source world. The PHP Nuke project is a typical example.

With this project, developers of PHP Nuke forked into two camps. One camp wanted to make some radical changes to fix the things they didn’t like, and the other camp wanted to take a more incremental approach. The radical-change camp took a copy of the source code and started an open source project called Post Nuke. This was developed for a while and then another schism formed and a set of developers took the source code and started a project called Xaraya. All three projects are in the same area of Web Publishing and Content Management, and we cover Xaraya in detail in Appendix E. To an outsider the differences might not seem too significant, but to those involved they can be important.

Sometimes the community that creates a new project is the one that keeps the momentum going, while other times the community from the original project takes this responsibility. Sometimes both communities continue their involvement. But frequently, such community problems kill off all progress in all branches of a project. Forking is actually rare in open source circles, perhaps because of the associated trauma and risk (for more details, visit http://www.infoworld.com/article/03/10/24/42OPstrategic_1.html).

Frequently the project leader, or core group, is the one who sets the tone that either causes or prevents such problems. The dependence on the community for any meaningful progress leads us to the final principle of open source:

The health, maturity, and stability of an open source project are a direct reflection of the health, maturity, and stability of the community that surrounds it.

But remember, there is no authority declaring the life, death, or health of an open source project. If a project dies, its web site can live on. It just might be hard to tell what is happening with it.

Figure 1-1 summarizes the steps in the open source life cycle that we have described in this chapter so far.

Figure 1-1. The open source life cycle

Leadership in the Open Source Life Cycle

You can define success for an open source project as a community of developers making steady progress creating software to meet a need. Greg Stein, the chairman of the Apache Software Foundation, who has been deeply involved in many different open source projects, observes that there are two keys to the success of an open source project:

  • A clear, shared focus for the project’s vision among the developer community (i.e., strong shared values among the developer and user communities)

  • A project lead who encourages and rewards community participation

From the perspective of an IT department using open source, evaluating leadership quality is crucial, because it is such an important factor in the long-term viability of an open source project.

It is much easier to understand the concept of leadership in the commercial world than in the world of open source. In the commercial world, somebody is writing payroll checks, a formal corporate structure is in place, and people are assigned authority. People come and go, but who is in charge is not really at issue most of the time. Staff generally is motivated by getting paid, the challenge of the work, and the status and rewards of building a successful business.

Take out the formal authority and the getting paid part, and you are left with the motivation for most open source developers. They are working for the love of the craft and for the rewards of creating a successful open source project.

When a project leader is rigid, doesn’t accept and acknowledge the contributions of others, and is hostile to new directions for a project, it is much harder for a community to form.

“When someone comes to a project lead with an idea, the right attitude is to respond ‘Cool idea, why don’t you run with it’ in most cases,” says Stein. “Reconciling differences of opinion and at the same time keeping people motivated requires diplomacy, an inclusive attitude, and a generous, secure personality.”

This can be hard to achieve, given that people are working for the love of it and to fulfill their particular vision of what the software can do.

For the purposes of this book, the important point is that when evaluating open source projects, a key thing to look for is the presence of an open, accepting, community-oriented project leader. In Chapter 2, we will discuss in greater detail what specific evidence to look for.

Second-Generation Trends in Open Source

Some of the characteristics that governed the early days of open source have changed now that open source has become popular. Today, open source software is often being created not just for programmers, but also for end users. The OpenOffice project has created versions of nearly all popular desktop applications for word processing, spreadsheets, presentations, and email. Today’s open source projects are starting to compete with successful commercial products, and they are being started both by developers who want to create a better solution and by commercial companies seeking to create open source alternatives for their own purposes.

It is also becoming more common for large and small companies to use open source code as the basis of applications they have built, either as products or for internal use. Sun Microsystems has been a strong sponsor of the development of open source solutions including OpenOffice, which has become an increasingly credible alternative to Microsoft Office. SAP released a version of a database it bought from another company as open source to provide a database alternative to power its enterprise applications.

The early days of open source focused on creating infrastructure that could be used to create programs. The GNU C compiler enabled the creation of languages such as Perl and development tools such as Emacs so that they could run on a wide variety of platforms. The Linux kernel itself, Apache, databases including MySQL and Postgres, graphical user interface toolkits such as GTK (which led to the GNOME desktop) and Qt (which is the basis of the KDE desktop), and hundreds of other programs have resulted in a top-to-bottom application stack that helps developers create the tools they need, from the bare metal operating system right up to the user interface layer.

The existence of this complete stack has resulted in a proliferation of applications aimed at specific groups of users. Open source has become a new way to start a software business. A developer creates an open source product, gets help from a community of developers who are interested in the area, and runs a consulting business selling services to put the open source software to work. Open source content management systems such as Bricolage and Plone (both of which are covered in detail in Appendix E) have consulting firms operating on this model. Open source products exist for ERP software, portals, data warehouses, and enterprise application integration. The maturity of the stack has created a huge opportunity for IT. Learning how to take advantage of it is the point of this book.

The Different Roots of Commercial Software

Commercial software comes into existence in a completely different way. At the core of the creative process for commercial software is a vision that many people will pay for the software being created. For open source software, the intended audience and the developer are usually the same. So there is no mystery about what the audience wants. The requirements process consists of developers deciding among themselves what they want the software to do.

Commercial software companies must somehow determine what the intended customers want. This introduces a large amount of risk into the process, because determining requirements means making lots of assumptions about who the audience is. The economics of commercial software resemble those of a health club. Customers pay to join the club, because they can get access to a much better facility than if they built it themselves. But the club has to have the exercise machines the customers want. In commercial software, costs are shared across all customers and the company must create software that is powerful and configurable enough to solve business problems in all sorts of different environments.

Figuring out the needs of the market is a key skill for a commercial software company, and many people are involved in the process. Investors in the company, or the product marketing department, might conduct research to understand what customers want. Prototypes might be built and put in front of the target audience.

The problem for most commercial software companies is that it is fiendishly difficult to tell whether they are getting it right. The ultimate test is whether the software sells, and the sales staff frequently plays a key role in requirements gathering. But even if they get the first version right, the same issues of what the customers want must be revisited with every new version.

The Commercial Software Life Cycle

What generally happens with a successful software company is that the first version of the product is released, a few initial sales are made, and these customers start providing more and more information about what they like about the product.

Requirements gathering and the product roadmap

At most software companies the product management department keeps the list of potential features to be added to the product and crafts the definition of what each version will include. The product marketing department focuses on understanding what customers need (inbound requirements gathering) and then sending the message about why the product is of value to customers (outbound marketing).

The challenge in this process is that 10 or 20 segments of customers and potential customers might be providing information. The potential feature list is gathered from the developers, sales staff, product management, and product marketing. In most cases, even before the first release, the feature list contains years of work and hundreds of potential features. Each customer has his own opinion about what is most important.

The process of deciding what to do next involves several factors, among them balancing the features desired by the customers currently buying the software; adding features that might be attractive to new buyers; adding features that allow integration with other software in the marketplace; and adding features desired by other software companies that are using the software as part of their product or helping sell the software in combination with their product. In performing this balancing act, a company must ensure backward compatibility, which means that new features must not break old ones or force customers to redo work to configure or customize the product.

One way companies communicate their decisions to potential customers is through a product roadmap, which shows which features are coming along in the next version and what can be planned for future versions.

Notice how different this is from the open source requirements process. In the commercial process a group of people—engineers, product managers, product marketers, sales staff, senior managers—is trying to figure out what another group of people—customers and partners—want from the software. In the open source process there is only one group, the open source community. The community decides which features to include in each release of an open source project. And the developers decide what each individual feature should do based on their understanding of the need.

The odd shape of the commercial software feature set is caused by that attempt to balance the perceived needs of the current and future customer base, requests from important customers, recommendations from the sales staff, and features announced by competing companies. Open source features sometimes have an odd shape, because the competing and conflicting needs of developers are being balanced in a strange compromise.

Productization

One of the most challenging aspects of creating commercial software is taking a program that provides certain features and functionality, and turning it into a product. Many promising software companies fail because they underestimate the difficulty and importance of this step.

Productization means making software work for the general case and making it as easy as possible to use. For a custom program written by an IT department, it might be fine to have an XML properties file that controls the program. For a commercial product, users will probably expect a simple administrative interface to help set the parameters. For an open source product, installation might mean unzipping the source code, compiling the program, and then figuring out how to fit it into your production or development environment. Commercial products generally have an installation program that does a lot of this automatically.
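To make the contrast concrete, here is a minimal sketch of the kind of XML properties file an in-house program might rely on, along with the few lines needed to read it. The file layout, the property names, and the load_properties helper are all hypothetical and invented for illustration; they do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Hypothetical properties file. The element and attribute names are
# assumptions made for this example, not a standard format.
PROPERTIES_XML = """
<properties>
  <property name="db.host" value="localhost"/>
  <property name="db.port" value="5432"/>
  <property name="report.dir" value="/var/reports"/>
</properties>
"""


def load_properties(xml_text):
    """Return a dict mapping each property name to its string value."""
    root = ET.fromstring(xml_text.strip())
    return {p.get("name"): p.get("value") for p in root.findall("property")}


settings = load_properties(PROPERTIES_XML)
print(settings["db.host"], settings["db.port"])
```

An IT department that wrote the program can edit such a file by hand without much trouble. A productized version would typically wrap the same settings in an administrative interface that validates values and documents what each one does.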

Productization requires a huge amount of work. Turning a program into a product can take double or triple the amount of work it took to complete the original features. Here is some of the work that takes place during productization:

  • Creating administrative interfaces

  • Writing installation scripts

  • Testing features

  • Testing on different platforms

  • Performance tuning

  • Runtime monitoring through SNMP

  • Creating engineering documentation

  • Creating end-user documentation

  • Developing adapters to other programs, such as reporting software

  • Developing support for different databases and operating systems

  • Creating graphical configuration tools

  • Developing APIs

  • Developing web services

It is not crazy to think of an original development team of 5 requiring a productization team of 50 or more with various specialized skills to complete a product. What happens in most start-up software companies is that the original development team becomes the productization team, which causes two problems. First, the engineering team doesn’t like the work of productization and is not good at it. Second, it slows down development of later features.

Early in a company’s life, highly skilled customers oriented toward innovation will do without many aspects of productization as long as the functionality provided by the software is compelling (these people fall into the innovators category of customers we will discuss when we talk about the Open Source Skills and Risk Tolerance model in Chapter 3). This happens because the companies that are most likely to buy new versions of commercial products are composed of innovators and early adopters, and they have the skills to overcome the lack of productization.

But if a company does not learn how to productize its software, it is doomed when it comes time to sell to a broader market.

Maintenance and support

Once a commercial software product has been released, it is supported with patches and bug fixes as needed. It is also supported by a technical support department that answers questions, and perhaps a training and education staff. Support services can be delivered through online resources such as email, or through discussion forums or telephone support.

Customers must usually pay between 15% and 25% of the original licensing fee in annual support costs.
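The arithmetic behind that range is worth working through, because support fees recur every year. The sketch below uses a hypothetical license fee of 100,000 and a five-year horizon; both figures, and the function names, are assumptions chosen only to illustrate the calculation.

```python
def annual_support_cost(license_fee, support_percent):
    """Yearly support fee as a percentage of the original license fee."""
    return license_fee * support_percent / 100


def cumulative_cost(license_fee, support_percent, years):
    """License fee plus support fees paid over the given number of years."""
    return license_fee + years * annual_support_cost(license_fee, support_percent)


LICENSE_FEE = 100_000  # hypothetical figure, in any currency
YEARS = 5              # hypothetical ownership period

for percent in (15, 25):
    yearly = annual_support_cost(LICENSE_FEE, percent)
    total = cumulative_cost(LICENSE_FEE, percent, YEARS)
    print(f"{percent}% support: {yearly:,.0f} per year, {total:,.0f} over {YEARS} years")
```

At the 25% end of the range, four years of support fees add up to the original license fee, so the cost of owning the software for four years is double the purchase price.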

The presence of a support team to help solve problems is one of the most popular aspects of commercial software. It is also required, because users don’t have access to enough information about how the product works to solve problems on their own. But even if all such information were available down to the source code, IT users of software still want support. Later in this book, we will analyze the emerging category of companies that provide support services for open source software.

End-of-life

When a commercial product goes into end-of-life, it means it will no longer be improved. Bug fixes and support might be provided for a limited period, but after that, commercial software companies will stop fixing bugs and answering questions about the software.

End-of-life happens for many reasons. Perhaps the product did not succeed, or it was superseded by a newer product. Companies can afford to support only a few versions of a software product. If Version 4 is just coming out, Version 1 that was released five years ago is no longer as important to customers or to the company. Version 1 might go into end-of-life so that resources can be focused on recent versions.

Sometimes, companies put versions into practical end-of-life by raising support and maintenance fees.

Productization: The Key to Understanding the Challenge of Using Open Source

Perhaps the simplest way to understand what you are getting into by using open source is to think of it in terms of the productization idea introduced earlier in this chapter:

Using open source software means taking on the burden of overcoming the lack of productization.

Most open source projects are only partially productized. But all of the information required for you to work around the lack of productization is available.

The key questions are:

  • How large a burden will it be to overcome the lack of productization?

  • Will it be easy or difficult for you to overcome?

  • Are the risks worth the benefit derived from the software?

The models described in the next chapters are aimed at answering those questions:

  • The Open Source Maturity model helps define the size of the productization gap.

  • The Open Source Skills and Risk Tolerance model helps gauge how hard it will be for an IT organization to overcome the productization gap.

  • The Software Cost and Risk model provides a framework for understanding the total costs, the risks, and the benefits of using an open source product.

It is no accident that the most skilled engineering teams in the country are also the largest users of open source. For them, the cost of overcoming the productization gap is small. The rest of this book will help IT departments understand the size of the gap for them.

In fairness to the open source community, we should mention another interpretation. From the perspective of a person with the required open source skills, the lack of productization is not a problem. Productization might even get in the way of a developer’s needs. From this perspective, the barrier to wider adoption is not the lack of productization, but the lack of skills in those who desire to use open source—the skills gap mentioned earlier.

Remember, productization in a commercial product is not black and white. Some companies do a better job of it than others. There will always be problems to overcome with any software, and commercial software also comes with a productization gap most of the time. In the rest of this chapter, we will look at commercial software and open source software side by side.

Comparing the Risks of Commercial and Open Source Software

The episodic, sporadic, incremental evolution of open source is in sharp contrast to the more methodical design process that most commercial projects go through. This chapter has pointed out that IT technologists and executives who are seeking to understand the opportunity open source provides must understand the nature of commercial software, which is born out of a completely different process and has different strengths and weaknesses.

But the different processes can easily obscure the nature and quality of the end product. Whether open source or commercial software is better for a particular company or a particular purpose is a complex decision that is a function of the quality of the software, how well it fits a particular task, and the skills present in the development team. The models that we describe in subsequent chapters provide a framework for understanding these issues.

The fact is that most software, open source and commercial, has problems that must be overcome for it to be useful. Figure 1-2 shows how the best software from the open source world shares certain characteristics, as does the worst. But most software is not located at the extremes of this scale. Most software falls into a gray area, meaning it has significant problems that might or might not be showstoppers, depending on the context.

Commercial software vendors would have you think their software is perfect and without blemish. However, anyone who has bought commercial software and used it extensively finds all sorts of rough edges. This is true even for the best software from the best companies. The closer you get to a commercial software product, the uglier it looks. Most software is in the gray area, which means that in the evaluation process the nature of the software’s defects must be revealed to determine whether a program is suitable for your company. In the following discussion, we will examine some important differences between open source and commercial software that affect the evaluation process, and differences in the risk of owning and operating each type of software.

The Sales Process

As we have pointed out repeatedly, open source software generally doesn’t come with a salesperson to guide you through the process of learning how the software might help you. While salespeople can be a tremendous help in gathering information about a piece of software, even the most naïve among us knows that all the information comes with a strong positive bias. Whitepapers, references, case studies, and so on, all paint a rosy picture. Software companies spend millions of dollars to influence the opinions of third-party analysts. Much of the most useful information, such as product support databases, is not available until after purchase. Sales staff is seldom rewarded for providing reasons not to buy the software. The sales process is generally of little help in getting to the key problems of software in the gray area.

Figure 1-2. Shared characteristics of open source and commercial software

With open source, much more information about the software is available on the product’s frequently asked questions (FAQ) lists and bulletin boards. Much of the discussion on these forums actually concerns the problems that keep an open source project in the gray area. The information will not come to you, but in most cases it is there for you to find. And you must be careful about accepting such information as authoritative. Some open source project participants have anonymously and systematically posted negative information about competing projects.

Transparency

With a commercial product, the company controls most of the information. The claims for product features might or might not be backed up by solid code, especially when it comes to new features. Customers who have run into problems usually don’t post this information for public display, to avoid harming their relationship with the company or to avoid legal retribution. Independent user groups also usually want to keep good relations with the company. And user-group information is frequently available only after you are a customer anyhow.

In an open source project, the entire sausage factory is there for you to see, in all its complexity and ugliness, from day one. The chief barrier to getting the information is the time it takes to rummage through source code, bulletin boards, and so on. Search engines have made this process much easier. The most annoying and controversial bugs are usually not hard to find.

The ultimate test of whether an open source software product will work for your company is to install it and try it out. With commercial software, this is not always an option, and when it is, it frequently costs money. In open source, the costs are measured in terms of time and effort, but the opportunity to test the software is always available. While it might require effort, with an open source project there are no barriers to finding the problems in the gray areas.

Flexibility

There are well-defined ways to extend a commercial product. APIs are defined to allow users to write programs using the functionality of the applications. If the APIs allow everything you need, all is well. But if the product doesn’t expose some deep, inside functionality in the right way, it can be years before a requested change becomes part of the product. The company’s view of how many other customers want the same type of feature determines its decision to incorporate the requested change.

In open source, there is no such barrier. If you want to use deep, inside functionality of the software, you can write your own APIs to get access to it. Of course, this requires time, effort, and expertise, and for a large program it is hard to know if such deep surgery will have unintended side effects. But you do not have to ask anyone for permission. You must simply do the work.

Risk of Quality

It is not uncommon for a commercial product to be interesting because of a new feature. That is why new features are created—to attract customers. But how well does that new feature work? With a commercial product, it is hard to tell until you buy it, mostly because of the issues mentioned already. And if you encounter any problems, you can’t fix them.

In an open source project, a new feature might be implemented to spur discussion. The problems with the new feature might be announced on the project web site, without shame. The idea, of course, is that the original developer might not have the time or skill to get the feature right, but someone else might. In an open source project, everything is open to improvement, if you have the time and skill. Figuring out what skill level you possess and what kind of tasks you can handle is the goal of the Open Source Skills and Risk Tolerance model.

In an open source project, it is generally possible to access the knowledge of senior programmers. In healthy open source projects, intelligent questions that are crucial to determining the product’s future, or fixing new bugs, are addressed by the most senior developers on the project. With commercial software, access to senior engineers is difficult for an external company to get.

Risk of Productization

As a general rule, commercial software is more productized than open source software. Installation scripts, administrative interfaces, and documentation are usually better for a commercial product than for an open source product of the same age.

This chapter argues that to succeed with open source software, it is wise to plan for overcoming the lack of productization. It is the fundamental argument of this book that developing the skills to do so can pay great benefits.

Risk of Failure

Both commercial software vendors and open source projects can crash and burn. Commercial software companies almost always try to protect customers by putting their source code in escrow, so it will be available should the vendor go out of business. The problem is that escrows can become out-of-date. It requires a lot of effort to keep an escrow current, along with everything needed to create the software. Does the escrow match the software at every customer site? How can it be made to match? Who would do this work?

In an open source project, a working copy of the source code is in the possession of the company using the software, from day one. This copy of the software is not out of sync, and it has everything needed to change, recompile, and assemble the working program. If an open source project tails off, those using the software are at much lower risk than customers who must rely on an escrow.

Risk of Takeover

In the world of commercial software, a takeover can change everything. If a larger company buys a smaller competitor, the customers of the smaller company might eventually face a forced migration to the larger company’s product. In a recent series of shocks to the IT industry, PeopleSoft bought J.D. Edwards, only to be acquired in a hostile takeover by Oracle. It’s not hard to imagine how worried the original J.D. Edwards customers are. Sometimes, an acquiring company bends over backward so as not to alienate its new customers. But in this case, we’ve heard horror stories concerning loss of support and forced migration.

The closest thing to a takeover in the open source world is a fork, whereby a group of developers takes the source code and moves it in a different direction. A fork can take people's attention away from coding or hurt morale. But unlike what happens in some takeovers, nobody is forcing work on the original project to stop.

Support

Support is perhaps the biggest advantage of commercial software. The 15% to 25% of the license fee paid every year funds a staff of engineers and support technicians who are on call to help when you have a problem. This is of great value to most companies and is a great comfort if the support is well organized and of high quality.
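To make the 15% to 25% figure concrete, here is a minimal sketch of the arithmetic. The license fee and the five-year horizon are hypothetical numbers chosen only for illustration, and the function name is our own; it assumes the percentage is applied to the original license fee each year, with no price increases.

```python
def support_cost_range(license_fee, years, low_pct=15, high_pct=25):
    """Return the (low, high) cumulative support cost over a number of years.

    Assumes the annual support charge is a fixed percentage of the
    original license fee (a simplification for illustration).
    """
    low = license_fee * low_pct * years // 100
    high = license_fee * high_pct * years // 100
    return low, high


if __name__ == "__main__":
    fee = 100_000  # hypothetical license fee, in dollars
    low, high = support_cost_range(fee, years=5)
    print(f"Five-year support cost: ${low:,} to ${high:,}")
```

Under these assumptions, five years of support costs between three-quarters of the original license fee and more than the fee itself, which is the budget an open source alternative must be weighed against.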

Some open source projects provide excellent support to their users; sometimes the quality of that support exceeds that of commercial products. But in a broad sense, while commercial software has clear channels of support, open source support often means figuring things out on your own.

With open source projects, large collections of information can help with some elements of support, but for the most part, you are on your own. This is a scary prospect for all but the most skilled programmers. Some companies have stepped into this gap and are offering support services for popular open source products. One of the fundamental arguments of this book is that companies who become skilled enough to be able to support open source have much to gain.

Looking back on this list, we can see risks and responsibilities for both commercial and open source software. With commercial software, the customer pays the vendor to manage the risks. In the world of open source, you must manage the risks yourself. By the end of Chapter 5 of this book, you will be an expert at determining and comparing these risks.
