The combination of Oracle Corporation and open source software may appear to be an unlikely pairing. What could Oracle, with its history of ruthless competition, intense marketing, and cutthroat corporate life, have to do with the collaborative, altruistic, and apparently anti-corporate world of open source?
The answer, surprisingly, is quite a lot. In recent years, the gospel of the open source movement has spread far and wide, reaching even the corporate corridors and product lines of organizations like Oracle Corporation. Consider the following recent developments:
Oracle8i has been officially ported to the freely available Linux operating system.
Dozens of excellent applications written by open source developers—Orac, Oddis, Karma, Oracletool, OraSnap, Big Brother, jDBA, GNOME-db, and many more—give Oracle database administrators and developers new tools for managing their databases and building new applications. And if one of these tools doesn’t do exactly what’s needed in a specific environment, the source code can be modified without restriction.
In our opinion, this new synergy between the corporate world of Oracle and the freewheeling world of open source is a great thing—the blended products we’re starting to see truly do represent the best of both worlds.
The purpose of this book is to share what we know about this blending of Oracle and open source. It’s a new world, and one that hasn’t been examined much to date. Although many books have been written about some of the base open source technologies we’ll be exploring in this book (for example, Perl, Tcl, and Apache), there has been very little written about the way these technologies are used with Oracle databases, and even less about most of the Oracle open source applications available to DBAs and developers. Although we can’t possibly describe every open source technology in depth within the confines of this single volume, we’ll try to provide a foundation. Our overriding goal is to shed light on as many of the open source technologies and applications as possible and on how they communicate with Oracle databases. In trying to achieve this goal, we’ll weave together three distinct threads:
We’ll explore the major open source technologies on the current computing landscape—Perl, Tcl, Python, Apache, GNOME, GTK+, and even some open source corners of Java—and explain how they are being used to build Oracle applications and connect to Oracle databases.
We’ll describe the best of the Oracle applications that are currently available for you to download, use, and, if you wish, modify. Some of these applications are tools for performing database administration tasks; others provide frameworks for application development of your own.
We’ll try to motivate you to consider writing your own Oracle open source applications in the future.
The world of open source software is a wide-ranging one. In this book, we’ll focus on the open source technologies that are most often used by Oracle developers:
We’ll focus on Perl, Tcl, and Python, the most popular languages and the ones providing the most solid connections to Oracle databases. Chapter 2 and Chapter 3 describe how to obtain and install the base languages, how to use their companion graphics toolkits, and how they communicate with Oracle. Chapter 4 describes two excellent open source applications: Orac, built on Perl, and Oddis, built on Tcl. Both are good examples of how you can build Oracle-based graphical user interfaces (GUIs) with open source scripting languages and toolkits.
We’ll next look at an alternative to GUI applications: Oracle applications that use the Web, rather than a GUI, as the user interface. We’ll focus on the Apache web server, CGI programming with Oracle, and the use of such embedded scripting tools as PHP and EmbPerl. Chapter 5 describes these open source technologies and how they connect to Oracle databases. Chapter 6 describes an assortment of excellent web-based applications that can be used for Oracle database administration: Karma, Oracletool, OraSnap, DB_Browser, PhpMyAdmin/PhpOracleAdmin, WWWdb, and Big Brother.
You may be surprised to see any coverage of Java in this book. Java is, after all, still a proprietary language of Sun Microsystems. Many open source developers are now building Oracle-based applications in Java, however, and Oracle is providing extensive support for Java connectivity. Chapter 7 describes the basics of Java DataBase Connectivity (JDBC), which is used to connect programs to Oracle databases. It also describes the open source JDOM servlet API and some other Oracle-related aspects of Java, including the Apache JServ web server, now distributed with Oracle’s Internet Application Server, and how you can use it to build Oracle Enterprise Java servlets. Chapter 8 describes several excellent Oracle open source applications based on Java: jDBA, ViennaSQL, and DBInspector. It also explains how you can obtain and install DB Prism, a servlet-based tool for use with Oracle’s PL/SQL and XML code facilities.
GNOME is an ambitious project aimed at creating a complete, open source desktop environment for Linux. Now that Oracle8i has been ported to Linux, GNOME and its excellent GTK+ graphical toolkit are available to Oracle developers. Chapter 9 describes the basics of GNOME and GTK+ and how you can use them to build graphical applications for Oracle. Chapter 10 describes the excellent Orasoft applications suite, a full-functioned database administration toolkit for Oracle DBAs, as well as several other GNOME/GTK+ applications that communicate with Oracle databases: GNOME-DB, gASQL, Gnome Transcript, and Gaby.
Although the sequence of chapters is roughly chronological (Perl and the other scripting languages were available before the coming of the World Wide Web, which preceded the development of Java and GNOME/GTK+), all four basic areas are still vibrant and fertile areas of open source activity, and all are focal points for Oracle application development. To a large extent, the choice of which open source technologies to use for developing an open source application is simply a matter of personal preference. Do you want to write in Perl, C, or Java? Do you want your user interface to be GUI or web-based? Fortunately, regardless of your choice, there is excellent Oracle connectivity available with all of the technologies we describe in this book. In subsequent chapters, we’ll focus on this connectivity. As interesting and wide-ranging as these core technologies are, we’ll try to limit our discussion to what you need to know in order to use them with Oracle, in an effort to keep this book a manageable size.
We’ve mentioned a lot of open source programs already, and we’ll describe even more in this book. Table 1-2, included near the end of this chapter, provides a full list of the programs described, along with URLs for further exploration. (We’ll also try to keep the URLs current at the O’Reilly web page for this book—see the preface for details.)
Before we get into the details of how you can use and build Oracle open source software, let’s take some time to explore what open source is and why you might want to use it.
The quick and dirty definition of “open source” is that it is software that’s freely available: you can acquire it freely (it’s usually downloaded from the Internet at no cost), and you can modify it freely (the source code is provided, not just the executable files). However, it’s important that we refine this quick use of the word “free.” There are important semantic differences between what the Free Software Foundation (FSF) defines as “free” and what the newer Open Source Initiative defines as “free.” We’ll explore both movements and their respective definitions later in this chapter. For starters, we’ll just note the functional definition of “open source” from the Open Source Initiative web site, at http://www.opensource.org:
Open source promotes software reliability and quality by supporting independent peer review and rapid evolution of source code. To be OSI certified, the software must be distributed under a license that guarantees the right to read, redistribute, modify, and use the software freely.
The OSI web site continues with a description of why open source is of such high quality:
The basic idea behind open source is very simple. When programmers on the Internet can read, redistribute, and modify the source for a piece of software, it evolves. People improve it, people adapt it, people fix bugs. And this can happen at a speed that, if one is used to the slow pace of conventional software development, seems astonishing. We in the open-source community have learned that this rapid evolutionary process produces better software than the traditional closed model, in which only a very few programmers can see source and everybody else must blindly use an opaque block of bits.
A little later in this section, we’ll discuss the open source license and the details of what it means to be OSI certified. For now, let’s ask a few more questions.
Many people, accustomed to the traditional world of commercial software, are puzzled by the whole notion of open source. Why would anyone give away something of obvious value? Because they don’t understand the motivation, they tend to distrust the product. They ask:
How can open source software be any good if its author doesn’t value it enough to charge for it?
How can it be trusted to perform if it doesn’t have the weight of a business behind it?
How can it be responsive to users’ needs if users don’t have any contractual claim on it?
By even asking these questions, though, people are ignoring the clearest success stories of open source: Apache, the world’s dominant web server; Perl, the world’s dominant scripting language; and Linux, the world’s most rapidly growing operating system kernel. All of these pieces of software have already been enthusiastically accepted by the corporate world as stable, successful, and, if not the best solutions in their fields, then at least major challengers. Many people seem to have a blind spot; they assume that because this software has, in fact, received commercial acceptance, it therefore can’t be open source. But Apache, Perl, Linux, and the many other excellent pieces of software described in this book are indeed open source, through and through.
In the next few sections we’ll look briefly at what open source is all about and why you, as an Oracle DBA or developer, might find it useful. Then we’ll shift our focus to the main open source technologies available today, how they’ve been used to communicate with Oracle databases, and how you can use them to build your own Oracle-based applications.
Considering the enormous amount of software that’s available to you from Oracle Corporation and various other software providers, why should you, as an Oracle DBA or developer, care about open source software?
Open source software is free in several senses of the word. It is typically free of cost, which means that you don’t have to pay for it. It is also free of code restrictions, which means that the source code is provided, and you are free to modify it. Let’s look at these two characteristics in turn and consider their relevance to your life with Oracle.
Oracle DBAs and developers typically work in corporate environments. The stereotypical user of open source software is the guy in the garage, with no salary, no budget, and no deep corporate pockets. The typical Oracle DBA isn’t in that category; he often has a reasonably generous budget for software. What he might not have, however, is the flexibility to use that budget as he sees fit. If you need a piece of software right now, but you can’t buy it without going through a lengthy approval process, then those corporate pockets won’t seem so deep after all. You might as well be in the garage.
We’re overstating the point, of course, but clearly the ability to simply download and try out many different possible tools before settling on one is enormously beneficial.
The second characteristic of open source, the ability to customize the source code to suit you, is often even more important. Most Oracle DBAs and developers are in a hurry. They are constantly fighting fires and trying their best to support the disparate needs of the users in their organizations.
The rapid advancement of technology, combined with the ever-increasing needs of an interconnected world, makes the life of an Oracle DBA or developer more complex than ever. At one time, a DBA might have been able to do her job using a single, straightforward administration tool, combined with a small system of character-based entry forms. No more: the need for 24 × 7 availability, distributed processing, data warehousing, and ever faster response times has led to a bewildering array of products and responsibilities. Web connections, telecom billing, secure payments, and a host of other problems abound. It is the lucky Oracle DBA these days who is not weighed down with a GSM mobile, a pager, and a Palm Pilot filled with difficult assignments scheduled for particularly unsociable hours, or the lucky Oracle developer who has already mastered all the latest web development tools, Java servlets, and XML data parsers. Every site is different, and every user’s problem is a new one. An enormous advantage of open source software is that if it doesn’t do what you need, you can adapt it.
Although “open source” is a relatively new term, its origins date back to the early days of computer software. And the more general concept of gaining value by giving value is one that predates computers themselves.
Most monetary economies in the corporate world are driven by some type of control mechanism or exchange mechanism (or, to be less technical, by the carrot and the stick). Let’s suppose that you are a dedicated programmer working for largeCorporationWidgets.com, and your boss tells you to code up an Oracle Forms program or set up a database. You generally do as you’re told. Why? Because you have previously agreed to be under the control of the company in return for various company benefits. The “stick” side of the equation is that if you fail to carry out an assigned task, you won’t be paid and you may lose your job. On the “carrot” side, you will receive a salary in payment for doing your job, and eventually you may earn a promotion.
Until recently, these complementary rationales were the only generally accepted methods for driving innovative software creation (and for motivating dedicated programmers, most of whom have the natural herding instincts of paranoid tigers). However, hiding behind this control and exchange mechanism are two implicit assumptions:
largeCorporationWidgets.com controls all access to computing power and the network. If they fire you, you will no longer be able to write programs or communicate with any of your programming friends around the world. If you’re a dedicated hacker, this may be important to you.
You need the money.
In contrast, let’s look at an alternative, rather plausible universe in which these two assumptions don’t necessarily hold true. Suppose you’re a gifted hacker at MIT who is happy living upon a secure research grant; or you’re a well-paid professor of computer science at Stanford who wants to attract the brightest undergraduates; or you’re a dedicated nighttime hacker, who makes enough during the day (possibly supporting legacy systems for largeCorporationWidgets.com) to pay for the latest workstation and a 24-hour, high-speed Internet connection. In all of these cases, you find yourself with an abundance of computer-related material goods, networks, and processing power, plus enough money to eat and pay the rent. When it comes right down to it, you simply don’t require any more in the way of material possessions (computer or otherwise). In none of these cases is there any outside force that can persuade you to work any harder.
Given that money isn’t a sufficient motive in these cases, what then could motivate you to write those programs? There are a number of possible motives:
The need to achieve the highest possible status among your peers
Pure scientific curiosity and the need to write programs that are interesting and fun
The need to solve problems that are hampering your ability to do your job (in the open source world, this is known as “scratching a developer’s itch”)
Let’s look at these three motives in turn.
The gift culture is an anthropological term that Eric Raymond, a major force in the open source community (we’ll discuss his role a bit later in this chapter), first used to describe the culture of open source. In the world of software, the idea of a gift culture is, basically, that if you give away cool programs that do useful things for other people, then your stock will rise among your colleagues. The degree of regard in which others hold you will depend exponentially upon the quality of your gifts. This last point is the prime engine that drives the open source movement. Raymond writes that, in a gift culture:
Participants compete for prestige by giving time, energy, and creativity away . . . In gift cultures, social status is determined not by what you control but by what you give away.
Thus, when Larry Wall gives Perl (essentially, the Mount Olympus of open source software) away to the hacker community, he rises to a transcendental plane occupied by only a few other deities. This pantheon is an extremely exclusive club, and no amount of money will buy you a membership card. You have to earn membership another way—by giving away the virtual crown jewels of your programming creativity.
The whole notion of the gift culture may seem strange to you, but note that the concept is not specific to the world of computing. Raymond writes:
Gift cultures are adaptations not to scarcity but to abundance. They arise in populations that do not have significant material-scarcity problems with survival goods. We can observe gift cultures in action among aboriginal cultures living in ecozones with mild climates and abundant food. We can also observe them in certain strata of our own society, especially in show business and among the very wealthy.
There is a human need for progress, innovation, and building things of value that is by no means unique to the world of computing. Most scientists tend to gravitate to areas that intrigue them, for whatever reason, and a surprising number choose to focus on these areas even if there is little chance of immediate financial reward.
Consider Albert Einstein. Einstein spent many years studying on his own a topic of great interest to him: a comparison of James Clerk Maxwell’s idea (circa 1865) that light traveled at a single relative fixed speed to Max Planck’s alternative ideas (circa 1900) on constant light energy quanta. Einstein had ample opportunity to perform lucrative pre-World War I physics research for any one of the warlike governments of Europe, but he chose to stay ensconced within the quiet of his Swiss Patent Office in Berne. With no particular promise of financial reward, he was determined to resolve the problem of what would happen if you could travel upon Maxwell’s constant-speed lightbeam in Planck’s quantized universe.
In the world of computer software, we see a similar process. Although much innovation is directed by the promise of financial reward, a good deal of energy goes into developing software that is simply interesting to an individual developer. Some of that development leads to blind alleys, of course, but in the best cases, developers create wonderfully innovative programs that ultimately become virtual standards.
There are a great many interesting technical problems to solve and a great deal of challenging software to be written. How do open source developers decide how to spend their time? Most often, they start out trying to solve a particular problem or speed up a process that is getting in the way of doing their work in the most efficient way.
As Eric Raymond puts it, “Every good work of software starts by scratching a developer’s personal itch.” An individual developer is unsatisfied with the capabilities of a particular piece of software, so he decides to extend the program to suit his own needs. Maybe it turns out that other developers can benefit from his work. Ultimately, that extension, or modification, gets added to the general community of programs, just as in traditional science, one scientist’s new theory gets added to the canon of established science. The most functional and usable programs become the foundation stones for the next generation of software. Since so much open source software is built on what has come before, no self-respecting open source developer would even contemplate keeping his innovations secret or making them too expensive for his peers to acquire. Most such developers see themselves as being only the temporary torch bearers in an unending chain of runners. To fail to pass on the baton would be to forfeit all respect.
In a similar way, all scientists build upon the great ideas of the past. However creative and revolutionary Einstein’s work was, it relied in part on work that had gone before. Without Maxwell’s mathematical equations, published in the public domain for all to see, Einstein would not have been able to build upon them, and physics could have languished for a much longer time under the older Newtonian “optics” model. And, continuing in the scientific tradition, Einstein’s work built a foundation for others to work on and modify in the future.
A good example of this progression in the computing world is the Linux operating system, which we’ll look at in somewhat more detail later in this chapter. Linus Torvalds took the ideas contained within Andrew Tanenbaum’s Minix operating system and used them to create his own kernel. Torvalds’ original motivation was an “itch,” the desire to read Usenet newsgroups without becoming reliant upon the binary-only operating systems that at that time were the only systems available on PCs. Torvalds published his early efforts in 1991, and the Linux project rapidly grew to become the poster child for open source. However, Linux would never have succeeded as it did without the availability of the source code for Tanenbaum’s Minix. And the many creative developments built on Linux (for example, the GNOME/GTK+ technologies we’ll look at in Chapter 9 and Chapter 10) would probably never have come into existence.
It may be useful to take a look back in time to see how we got where we are today. The concept of open source didn’t just arrive, without precedent, on the computing scene. In some ways, as we’ve seen with Einstein, the basic motivations are as old as science itself. However, you’ll be relieved to know that we’ll confine our journey to the world of computing. Obviously, in this short section we can only touch on many events that are more complex and interesting than we can possibly express here. We apologize for the necessary brevity and oversimplification.
To start our journey, let’s go back all the way to Alan Turing in 1945. After helping break the German military codes (including the infamous Enigma) in World War II, Turing began working for the National Physical Laboratory (NPL) in Britain on a project known as the Automatic Computing Engine (ACE). Although ACE wasn’t the success Turing had hoped for, his ideas on artificial intelligence (such as the famous Turing test) permeated the NPL. Turing had a strong influence on Donald Davies, whose 1966 “Proposal for a Digital Communication Network” would later play a crucial role in the development of the ARPAnet in the United States.
With some cross-fertilization between NPL and the Massachusetts Institute of Technology, Turing’s ideas on AI finally took root in 1956 under the leadership of John McCarthy (the inventor of Lisp) and Marvin Minsky, and culminated in the founding of the MIT Artificial Intelligence Project in 1960. A decade later, in 1971, the AI Lab at MIT took on perhaps its most famous student to date, Richard M. Stallman, thereby setting in place a chain reaction of crucial events that culminated in the GNU/Linux operating system. Before we get there, however, we need to look back at a second historical thread.
In 1958, the United States government created the Advanced Research Projects Agency (ARPA). ARPA gradually began to take on much of the “blue-skies” computer research for the Pentagon and the rest of the U.S. defense establishment. In 1962, J.C.R. Licklider arrived at ARPA from MIT and put in place the research processes for networking and time-sharing operating systems. By 1969, these systems had come together into the ARPAnet project, which was the direct lineal ancestor of today’s Internet.
The key to the ARPAnet’s revolutionary communications system was a distributed, fault-resilient network with multiply redundant nodes. The idea was that, even if many of these nodes were damaged, messages would still be able to get through via whatever path remained. This model, originally developed by Paul Baran, was initially rejected by AT&T (which at the time supplied many of the American government’s communication requirements), because it did not match their more centralized analog systems. However, the NPL scientist Roger Scantlebury later rediscovered Baran’s work and combined it with Donald Davies’ research. The result was the basic notion of packet switching, a concept on which the entire Internet rests. The ARPAnet subsequently went from strength to strength, eventually evolving into today’s Internet. Before we jump too far ahead, though, let’s look at the third thread in the interwoven origins of open source.
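The redundancy idea described above can be illustrated with a minimal Python sketch. It is not a model of the real ARPAnet or of packet switching itself; the five-node mesh, the node names, and the `find_route` helper are all hypothetical, chosen only to show a message still finding a path when some nodes are knocked out.

```python
from collections import deque


def find_route(links, source, target, failed=frozenset()):
    """Breadth-first search for a path from source to target.

    Nodes listed in `failed` are treated as destroyed and are never visited.
    Returns the path as a list of node names, or None if no path survives.
    """
    if source in failed or target in failed:
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in links.get(node, ()):
            if neighbor not in visited and neighbor not in failed:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical mesh: every node has more than one neighbor,
# so no single failure cuts the network in two.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}

print(find_route(LINKS, "A", "E"))                      # all nodes up
print(find_route(LINKS, "A", "E", failed={"C"}))        # one node lost
print(find_route(LINKS, "A", "E", failed={"B", "C"}))   # A is cut off
```

With every node working, the message takes the short route through C. With C destroyed, it detours through B and D. Only when both of A’s neighbors are gone does delivery fail, which is the property a centralized design cannot offer.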
AT&T originally resisted Baran’s multiply redundant and distributed network. However, in 1969, at the Bell Labs province of the AT&T empire, Ken Thompson invented a new operating system called Unix (the name was allegedly a pun on the earlier MULTICS system). With the help of Dennis Ritchie’s new C language, Thompson went on to make the Unix operating system truly portable by adapting it for a wide range of machines and writing portable Unix-based programs in C. The fact that the source code for Unix was also openly distributed to a large number of universities and institutions (the University of California at Berkeley played a particularly important role) helped with the operating system’s portability. This was especially significant at the time because AT&T had been blocked from entering the computer business by a U.S. government antitrust measure in 1956.
From 1974 on, staff at Berkeley and Bell Labs worked in close cooperation to improve the original AT&T Unix (mainly on the Digital Equipment Corporation PDP-11). The operating system quickly became a favorite of hackers because of its special approach: it started with simple, robust tools, which fed into more complex ones via the Unix pipe concept. In 1977, Bill Joy put together the first so-called Berkeley Software Distribution of Unix (BSD), which he worked on until 1982, when he joined the fledgling Sun Microsystems.
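The pipe concept mentioned above is easy to imitate in a scripting language. The following Python sketch is only an analogy, not how Unix implements pipes: each small function does one job and hands its output to the next, roughly as a shell pipeline of grep, cut, sort, and uniq would. The function names and the sample log lines are hypothetical.

```python
def grep(pattern, lines):
    """Pass along only the lines that contain the pattern."""
    for line in lines:
        if pattern in line:
            yield line


def first_field(lines):
    """Pass along the first whitespace-separated field of each line."""
    for line in lines:
        yield line.split()[0]


def count_unique(items):
    """Count how often each item occurs, returned in sorted order."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return sorted(counts.items())


# Hypothetical log lines: host name, severity, message.
LOG = [
    "db01 ERROR tablespace full",
    "db02 INFO backup complete",
    "db01 ERROR listener down",
    "db03 ERROR archive log gap",
]

# Each stage feeds the next, like stages joined by "|" in a shell.
errors_per_host = count_unique(first_field(grep("ERROR", LOG)))
print(errors_per_host)
```

None of the three functions knows anything about the others, so they can be recombined freely. That is the design choice that made the small Unix tools so attractive to hackers.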
As Unix became more widely used—due particularly to its portability—it became the standard operating system for the ARPAnet in the early 1980s (which is why web addresses now use a forward Unix slash rather than a backward one). The promised land of open standards and source code distribution was now almost in sight: the computing world was on the verge of having a networking system (ARPAnet), an operating system (Unix), and a programming language (C), all of them relatively free, and all of them working nicely together.
Unfortunately, defeat was snatched from the jaws of victory through a failure to overcome two unforeseen road hazards: the high price of Unix workstations at the time and the Ma Bell corporate breakup.
The first obstacle to early domination by the forces of free software was the price of Unix workstations. They were within the price range of universities and corporations, but they weren’t cheap enough for individuals. Ultimately, failure to lower the price led to the rise of relatively inexpensive, non-networked personal computers. Over the next 20 years, PCs with their plethora of proprietary software came to dominate the computer industry, pushing aside the lower end of the more technically advanced Unix workstations.
The second blow to early domination by free software came in 1984 when AT&T reached an agreement with the U.S. government to divest itself of its Bell telephone operating companies, in return for having the government lift the anti-monopolistic constraints of its 1956 decree. This development meant that AT&T could enter the computer business for the first time. Now, instead of giving its System V Unix operating system away, AT&T was able to exploit it commercially. System V Unix became a proprietary product just like any other one! Even the BSD-related flavors of Unix were suddenly available only via Unix vendors and the increasingly expensive AT&T source license.
Hackers around the world were appalled at the fate of Unix—but none more than Richard M. Stallman. (Most people regard Stallman as the grandfather of the open source movement, though he clearly separates himself from it by his parallel involvement with the Free Software Foundation, as we’ll see later in this chapter.) A brilliant and creative programmer who came to MIT as an undergraduate, Stallman started working at the MIT Artificial Intelligence Laboratory in 1971, and basically never left. Stallman was part of the early hacker culture at the AI Lab and a fierce advocate of free software. Although many of his colleagues left MIT over the years to pursue commercial ventures (many of them joining Symbolics), Stallman continued to rebuff the commercial world. With the 1984 commercialization of Unix, he dug in for a fight. He decided to build an operating system—essentially, a free Unix—from scratch, forming the GNU Project.
Before the dragon-slaying against AT&T’s Unix could begin, however, Stallman needed to create a new type of software license in order to protect his forthcoming work. In 1985, he created the Free Software Foundation (FSF) and, through the foundation, sold copies of his software. Many people find Stallman’s term “free software” confusing, and they are surprised that such software can be used to produce revenue. Most assume that the word “free” describes the price of the software. Stallman himself has provided one of the best statements on this topic:
Free software is a matter of liberty, not price. To understand the concept you should think of ‘free speech’ not ‘free beer’.
In addition, for a more detailed look at the rich and interesting story of the GNU Project and the Free Software Foundation, see:
In summary, the “copyleft” concept of the GNU General Public License says that you are free to download a program protected with the license (termed a GPL program), receive source code, and change that code if you want to. However, from this point on, whenever you distribute this program—whether you’ve modified it or not—you must give the recipients the same rights you had when you received the original. In other words, you cannot take GPL programs and then make them proprietary or binary-only, granting all of the subsequent rights to yourself. You must give away these rights, including rights to the entire source code—even if you embed only the tiniest piece of GPL code within your program code—and even if your program is a much larger, previously proprietary body of work you wrote entirely yourself. This notion turned out to be a very controversial one, and one that businesses were loath to accept. As we’ll discuss later (see “Open Source and the Commercial World”), taking a somewhat different approach to modern-day open source licensing has made the notion of open source more palatable to corporations. (There is another GNU license, the LGPL, or lesser GPL, which is not quite so rigorous. We’ll discuss this license, particularly in relation to the GNOME Project, in Chapter 9.)
With a dedicated GNU Project team around him, Stallman spent the rest of the 1980s at MIT, gradually building the large set of tools needed to craft a truly free operating system. By 1990, the GNU system was almost complete, except for one final element—the kernel, the beating heart of any operating system. The GNU Hurd project had long attempted to build this kernel, but they were still several years from completing it.
Fortunately for our story, the seventh cavalry, in the shape of Linus Torvalds, arrived from Finland to save the day. He bore his trusty Linux kernel in place of a squadron of Winchester rifles. Linux came from an open tradition of its own, so its kernel was readily adopted by the GNU Project as the final part of the new operating system. By 1992, the first version of what became known as the GNU/Linux operating system was in place.
Perhaps one of the most important developments emerging from the GNU Project was the development model that Linus Torvalds adopted for Linux. The typical corporate development project hand-picks a few colleagues to work toward rigidly designed goals. Torvalds instead adopted a “release early, release often” approach, which involved hundreds (if not thousands) of developers helping out in many disparate ways. There was something for every volunteer to do, from documentation to bug-chasing to hardcore coding of the kernel itself. The fact that a release might contain bugs wasn’t seen as a failure, but instead as a challenge for the many enthusiastic volunteers. This is now recorded in the memorable phrase, “Given enough eyeballs, all bugs are shallow.”
Frequent releases of the Linux product, bugs and all, were facilitated by CVS (the Concurrent Versions System, http://www.cvshome.org), which allows multiple module authoring and provides the enhanced debugging facilities required for mass-development projects like Linux. CVS is now the dominant development tool for open source projects. The CVS development model is mainly democratic, though most successful projects involve one benign dictator (such as Linus Torvalds) at the center of the controlled chaos, calling the most important shots and settling the inevitable arguments. As you might expect, the success of this development model has baffled corporations that are accustomed to the top-down style of authoritarian software development.
The late 1980s and early 1990s saw the emergence of a number of superb pieces of open source software that have had an enormous impact on the computing world. We’ve already touched on the Linux operating system, possibly the greatest open source achievement to date. Others we put in this “best of” category (especially taking into account their potential for use with Oracle) include the Perl, Tcl, and Python scripting languages and the Apache web server. (Of course, there is a lot of excellent software out there, and others may disagree with our choices.) We’ll briefly discuss each of these in Section 1.1.5. In this book, we’ll focus on how each of these open source tools is used to build Oracle applications; their scope is much wider, however, and we encourage you to learn more by consulting the references we’ve collected in Appendix C.
Throughout most of the 1990s, the high quality of open source software like Perl, Tcl, and Linux became more and more apparent even to the corporate world. Inside every company that followed the typical corporate development model were increasing numbers of individuals who downloaded, experimented with, and raved about both open source tools and open source approaches to building software. But even companies that were convinced of the quality of this software were suspicious of the stronger GNU General Public License and the libertarian stance of Stallman and others in the Free Software movement.
As the decade progressed, some open source enthusiasts struggled to find a way to reconcile the worlds of free and proprietary software. They believed that the open source and commercial worlds didn’t have to be at war, and that free and proprietary software, by coexisting, could benefit both sides. They wanted to figure out how to remove the stumbling blocks of the past and give the corporate world a way to use and contribute to the growth of open source software. Through a combination of savvy public relations and a new licensing paradigm, the “new” open source movement has had notable success.
Chief among the new open source pragmatists was Eric S. Raymond. Raymond, who wrote The New Hacker’s Dictionary back in 1990 and subsequently became something of an anthropologist of hacker culture, has been a major player in the present-day open source movement. In addition to writing software, Raymond had been writing for years about open source history and philosophy. His 1997 essay, “The Cathedral and the Bazaar,” was very well received by the developer community and fueled wider interest in open source. In that essay, Raymond compares the typical commercial development method of the cathedral (where many organized master masons gradually construct a pre-planned and religiously awesome monolith) to the bazaar of Linux (where many apparently unorganized bricklayers build a fluid market of ideas and sustainable growth, which gradually becomes an unstoppable force).
In January of 1998 came the open source equivalent of “the shot heard ’round the world.” Netscape announced that it would release the source code for its client product line, and CEO Jim Barksdale credited Raymond with being the “fundamental inspiration” for the decision. In the wake of this event, and the additional media interest generated by the March 1998 O’Reilly Free Software Summit, which brought 20 open source leaders together, Raymond effectively became the media spokesman for the movement. Gregarious and media-savvy, Raymond was the personality the media sought out to explain what this open source stuff was all about.
Raymond and others in the open source community were pleased by Netscape’s announcement and the media attention, but they wanted to be sure that they took full advantage of this moment in history. They wanted to see more corporations release source code and use open source programs within their own products. They felt that the future of open source programming would be in peril if the outside world continued to regard the open source movement as anticommercial.
One obstacle to commercial coexistence with open source in the past was the type of licensing attached to open source software. Back in the 1980s, corporations that wanted to use GNU Emacs and other tools from the Free Software Foundation were stymied by the rigorous GNU General Public License. That legal document was a major impediment to official corporate adoption of GNU software. The new open source movement (which was starting to coalesce into the Open Source Initiative, OSI) wanted to find a way to protect open source software, while giving the corporate world some incentive to use and add to the pool of open source software.
The emerging Open Source Initiative looked at a number of previous types of free software licenses, including the GNU General Public License, the X Consortium License, and the Perl Artistic License. They wanted to pull together the best aspects of these licenses and create a “software bill of rights.” The document they came up with drew heavily from the Debian Free Software Guidelines written by Bruce Perens and was first published in the summer of 1997. The idea was that an open source program that met the definition would be “OSI certified.” The Open Source Definition provides the imperial standard for protecting and certifying the quality of that software. A computer user or software developer who downloads a particular piece of open source software can know exactly where he stands in relation to a defined legal yardstick. We’ve included the text of the Open Source Definition at the end of this chapter, along with some additional explanation where we thought it would be helpful.
You can also read Bruce Perens’s essay on the definition at:
Like another pentagon of stars in a bright circlet constellation, there are five prominent open technologies that have gained the most attention over the past decade. Table 1-1 lists those technologies and web sites where you can obtain more information about each.
Table 1-1. The Open Source Crown Jewels
We’ve discussed Linux a bit already. In the following sections we’ll look briefly at each of the other technologies; you’ll learn much more about them all in subsequent chapters.
Perl (Practical Extraction and Report Language) was developed back in 1987 by Larry Wall as a way of making things easier—originally, for performing his own system administration tasks, and ultimately for a whole generation of developers. Perl is an interpreted scripting language that combines the best capabilities of a variety of other languages, but the whole of Perl is far greater than the sum of its parts. Perl was designed especially to be:
Perhaps the most accurate summary of what Perl is best at can be found in the README file written by Wall for Perl Version 1.0:
Perl is an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It’s also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). It combines (in the author’s opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC|PLUS.) Expression syntax corresponds quite closely to C expression syntax. If you have a problem that would ordinarily use sed or awk or sh, but it exceeds their capabilities or must run a little faster, and you don’t want to write the silly thing in C, then perl may be for you. There are also translators to turn your sed and awk scripts into perl scripts. OK, enough hype.
Since it was first released, the language has gone from strength to strength. Over the years, an enthusiastic and partisan army of Perl volunteers has extended the language in a myriad of ways. CPAN (the Comprehensive Perl Archive Network), an online repository of Perl core files, documentation, and contributed modules, has become a model for an open source development community. Perl has grown to be the “glue” language of the Internet and is ideally suited as a language for developing web applications and system management tasks and for allowing different systems to work well together. Perl 4 brought the release of modules allowing Perl to interact with Oracle databases. The current version of Perl, Perl 5, contains long-sought object-oriented features.
Because Perl is such a powerful and extensible language, you’ll find a lot of discussion of Perl throughout this book. Chapter 2 describes how to install Perl and connect it to Oracle databases. Chapter 3 focuses on Perl’s use as a scripting language, especially the use of the Perl/Tk GUI toolkit. Chapter 4 and Chapter 6 provide examples of some excellent Perl-based Oracle applications.
John Ousterhout began developing Tcl (Tool Command Language) in 1987 with the goal of creating a generic language that his students could use for all of their projects. They had been spending too much time developing new control languages for each individual project and too little time on their actual research. Ousterhout had the following objectives for Tcl:
Extensibility, so new Tcl applications could add their own features to the basic structure of the language
Simplicity, so Tcl could work easily with different applications and not restrict them
Good facilities for integration and the ability to easily blend in any future language extensions
As with Perl, Ousterhout future-proofed Tcl, making it possible for new modules to be added as required by other developers. One of the most popular of these modules has been Ousterhout’s own GUI toolkit (Tk). Since 1988, Tcl/Tk has been hugely successful with hundreds of thousands of users worldwide, ranging from NASA to teenage bedroom hackers. Chapter 4 describes how to install Tcl/Tk and how to use the modules that provide an interface to Oracle databases.
In late 1989, Guido van Rossum was working with the Amoeba distributed operating system, trying to create more useful tools for Amoeba system administration. He began work on a new language, which he called Python. By 1991, he had made the language publicly available, and Python has grown in popularity ever since. Python was designed from the start as an object-oriented language, which distinguishes it from scripting languages such as Perl and Tcl. Van Rossum’s goals for Python were to make it:
Portable, so it would be truly operating system-independent
Easy to learn
In possession of a powerful standard library
Like Perl and Tcl, Python has a huge set of features and is appropriate for just about any programming purpose you can think of.
In Chapter 3, we’ll describe how to install Python and its Tkinter windowing system. We’ll also describe how to use Python to connect to Oracle databases.
Apache doesn’t have a single founding father, in the way that Linux, Perl, Tcl, and Python do. The HTTP daemon program (httpd), developed by Rob McCool at the National Center for Supercomputing Applications (NCSA), University of Illinois, Urbana-Champaign, was the root project that Brian Behlendorf, Cliff Skolnick, and a number of other programmers ultimately turned into an open source web server. Apache has become enormously successful, and it is the leading web server in use today. It is a stable, scalable, and highly efficient product. Even Oracle Corporation is now embedding Apache within its Internet Application Server (iAS).
We’ll describe how to install Apache in Chapter 5. In Chapter 6, we’ll touch on Apache’s use as an underlying technology for Oracle-based web applications, and in Chapter 7 and Chapter 8, we’ll describe its use with Java applications—in particular, the Apache JServ web server now used by Oracle.
For most of its history, Oracle has not been particularly welcoming to open source from a corporate point of view, though individual developers and DBAs have historically used open source tools, like Perl, and shared their scripts and expertise with their peers.
But in July of 1998, the company announced that the Oracle8i database would be ported to Linux, with detailed information available at the annual Oracle Open World conference (held in San Francisco that fall), to which Linus Torvalds was invited as a guest speaker. At this conference, Kevin Walsh, an Oracle vice president, revealed the change in Oracle’s thinking by calling upon Sun to put Java into the open source community. He thought Java would develop faster if Sun loosened its control:
Sun should still decide what goes into Sun-blessed Java, but if they open the process, all those freeware versions of Java would have a lot less momentum. Linux is in many ways a reaction to Java. Open source is a different development model than what Sun has been pursuing, but it still merits consideration.
The conference was also attended by representatives from Intel and Netscape, who were as determined as Oracle to make Linux a non-Microsoft success.
The Oracle port to Linux was a watershed event in the history of open source. It was, of course, a significant event in its own right—the porting of the world’s leading database to an open source platform. But the timing was also critical. Oracle’s embrace of open source—at least in a limited way—sent a signal that the corporate world was taking open source seriously and making product decisions accordingly. In his “Revenge of the Hackers” essay, Eric Raymond wrote about this event:
To sustain the momentum, we needed commitments not from hungry second-stringers but from industry leaders. Thus, it was the mid-July announcements by Oracle and Informix that really closed out this vulnerable phase.
The promised port of Oracle8i to Linux arrived in a blur of activity that also included IBM’s port of DB2, Oracle’s biggest database rival in the high-end sphere, to Linux. Other vendors fell into line, and soon Linux ports of major commercial software were almost a matter of routine.
For those of us in the old guard, the availability of Oracle on Linux merely underlined what we knew all along, but for many others, this event legitimized Linux as a solid and serious business player. What’s more, there was now an effective server-space alternative to Oracle on Windows NT for small development shops. If you wanted to build your own Intel-based server but didn’t need all the horsepower (and cost) of a Sun enterprise-class computer, you could now provide Oracle-based solutions without resorting to NT.
Some of us Linux zealots actually had Oracle running on Linux before the fateful day of the official port. Back in the days of Oracle7, it was, in fact, possible to get the SCO-Unix version of Oracle up and running with an emulator called iBCS2. But this approach was really only for the diehards; it certainly wasn’t supported, and there were quite a few limitations.
We won’t spend a lot of time in this book describing Linux (though Appendix A does contain guidelines for installing Oracle8i on this platform). There are many books that describe this wonderful operating system. With the latest version of the Linux kernel now released with support for a journaling file system and non-buffered, raw I/O devices (and even loose talk of parallel server support), there seem to be no limits other than the amount of Jolt Cola that can be consumed by Alan Cox to keep the Linux bandwagon rolling. Linux has become a huge open source success story.
 Many words have been written about the concept and value of open source. We’re making a quick journey through a landscape that others have explored more fully, so we aren’t likely to do complete justice to the history and philosophy of open source. We’ll do our best to be accurate within the limits of this short chapter, but we recommend that you learn the full and very interesting story by consulting the excellent references listed in Appendix C.
 The term “open source” was coined at a meeting in February 1998 of the first participants in what would later become the Open Source Initiative.
 In this book, we occasionally use the term “hacker.” As used here, the term is more or less synonymous with “open source programmer.” Although “hacker” is sometimes used in the popular press to identify a person who breaks into systems (more properly, that’s a “cracker”), that’s not what we mean here. For more on hackers, see the definition in “The New Hacker’s Dictionary” (http://eps.mcgill.ca/jargon/ ).
 Eric S. Raymond, The Cathedral and the Bazaar (O’Reilly), pp. 63, 79.
 In our opinion, the pantheon also includes Linus Torvalds, John Ousterhout, and Guido van Rossum (the creators of Linux, Tcl, and Python, respectively), as well as Richard M. Stallman (as Zeus) and Eric Raymond (as Hermes); we’ll discuss Stallman a bit later in this chapter.
 Raymond, p. 79.
 GNU stands recursively for “GNU’s Not Unix.”
 Stallman himself officially left MIT in 1984 in order to prevent MIT from copyrighting his GNU work, but he was allowed to continue using the MIT AI Lab facilities.
 Many elements of the new operating system also came from BSD-Unix and the work of people such as Marshall Kirk McKusick. In 1994, Berkeley Software Design Incorporated, or BSDI (see http://www.bsdi.com), won a court battle with the torch-bearers of proprietary Unix to allow them to release a form of BSD-Unix (one stripped of any AT&T components) for free redistribution (4.4BSD-Lite, Release 2).
 This essay and others by Raymond were collected into a book, also called The Cathedral and the Bazaar (O’Reilly). See http://www.oreilly.com/catalog/cb/ for information; for most of the essays in their original form, visit http://www.tuxedo.org/~esr/writings/cathedral-bazaar/index.html.
 New Perl modules go through an evolutionary process that begins with an individual developer’s code, which she posts to CPAN. As others learn about the new module and start downloading, testing, and relying on it, it becomes more and more acceptable. If it’s good enough, and if enough people and products rely upon it, the Perl gods ultimately might decide to include the new module in the next general Perl distribution.
 Raymond, p. 179.