Chapter 1. Introduction

Since its first public release in 1991, Linux has been put to ever wider uses. Initially confined to a loosely tied group of developers and enthusiasts on the Internet, it eventually matured into a solid Unix-like operating system for workstations, servers, and clusters. Its growth and popularity accelerated the work started by the Free Software Foundation (FSF) and fueled what would later be known as the open source movement. All the while, it attracted media and business interest, which contributed to establishing Linux’s presence as a legitimate and viable choice for an operating system.

Yet, oddly enough, it is through an often ignored segment of computerized devices that Linux is poised to become the preferred operating system. That segment is embedded systems, and the bulk of the computer systems found in our modern-day lives belong to it. Embedded systems are everywhere in our lives, from mobile phones to medical equipment, including air navigation systems, automated bank tellers, MP3 players, printers, cars, and a slew of other devices of whose presence we are often unaware. Every time you look around and can identify a device as containing a microprocessor, you’ve most likely found another embedded system.

If you are reading this book, you probably have a basic idea why one would want to run an embedded system using Linux. Whether because of its flexibility, its robustness, its price tag, the community developing it, or the large number of vendors supporting it, there are many reasons for choosing to build an embedded system with Linux and many ways to carry out the task. This chapter provides the background for the material presented in the rest of the book by discussing definitions, real-life issues, generic embedded Linux systems architecture, examples, and methodology.

Definitions

The words “Linux,” “embedded Linux,” and “real-time Linux” are often used with little reference to what is being designated. Sometimes, the designations may mean something very precise. Other times, a broad range or category of applications is meant. Let us look at these terms and what they mean in different situations.

What Is Linux?

Linux is used interchangeably to refer to the Linux kernel, a Linux system, or a Linux distribution. The broadness of the term plays in favor of the adoption of Linux, in the large sense, when presented to a nontechnical crowd, but can be bothersome when providing technical explanations. If, for instance, I say, “Linux provides TCP/IP networking,” do I mean the TCP/IP stack in the kernel, the TCP/IP utilities provided in a Linux distribution that are also part of an installed Linux system, or both? This vagueness actually became ammunition for the proponents of the “GNU/Linux” moniker, who pointed out that Linux was the kernel, but that the system was mainly built on GNU software.

Strictly speaking, Linux refers to the kernel maintained by Linus Torvalds and distributed under the same name through the main repository and various mirror sites. This codebase includes only the kernel and no utilities whatsoever. The kernel provides the core system facilities. It may not be the first software to run on the system, as a bootloader may have preceded it, but once it is running, it is never swapped out or removed from control until the system is shut down. In effect, it controls all hardware and provides higher-level abstractions such as processes, sockets, and files to the different software running on the system.
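
To make these abstractions concrete, the short C program below touches all three of them through the ordinary system call interface. It is a minimal illustrative sketch rather than code taken from any real system, and the temporary file path it uses is arbitrary.

    /* abstractions.c: a minimal sketch showing the three kernel abstractions
     * mentioned above -- files, sockets, and processes -- reached through the
     * ordinary system call interface. The file path used is arbitrary.
     */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        /* A file: created, written, and closed through the kernel's VFS. */
        int fd = open("/tmp/abstractions-demo", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd >= 0) {
            write(fd, "hello\n", 6);
            close(fd);
        }

        /* A socket: a communication endpoint handed out by the kernel. */
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock >= 0)
            close(sock);

        /* A process: the kernel duplicates the caller and schedules both. */
        pid_t pid = fork();
        if (pid == 0) {
            printf("child %d running\n", getpid());
            _exit(0);
        } else if (pid > 0) {
            waitpid(pid, NULL, 0);
            printf("parent %d done\n", getpid());
        }

        return 0;
    }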

As the kernel is constantly updated, a numbering scheme is used to identify each release. The scheme uses three numbers separated by dots: the first two numbers designate the version, and the third designates the release. Linux 2.4.20, for instance, is version number 2.4, release number 20. Odd version numbers, such as 2.5, designate development kernels, while even version numbers, such as 2.4, designate stable kernels. Usually, you should use a kernel from the latest stable series for your embedded system.

This is the simple explanation. The truth is that, beyond the “official” releases, there are many modified Linux kernels available all over the Internet that carry additional version information. 2.4.18-rmk3-hh24, for instance, is a modified kernel distributed by the Familiar project. It is based on 2.4.18, but contains an extra “-rmk3-hh24” version number controlled by the Familiar development team. These extra version numbers, and the kernel itself, will be discussed in more detail in Chapter 5.
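
For illustration, the following C sketch shows one way a program can retrieve and decompose the release string of the kernel it is running on. It relies only on the standard uname(2) system call; the release string format it assumes (three dot-separated numbers, possibly followed by an extra suffix such as “-rmk3-hh24”) simply matches the scheme just described.

    /* kversion.c: print the version information of the running kernel.
     * A minimal sketch: uname(2) returns the kernel's release string
     * (e.g. "2.4.18-rmk3-hh24"); the leading "version.patchlevel.release"
     * triplet is extracted with sscanf() and any extra suffix is ignored.
     */
    #include <stdio.h>
    #include <sys/utsname.h>

    int main(void)
    {
        struct utsname uts;
        int version, patchlevel, release;

        if (uname(&uts) != 0) {
            perror("uname");
            return 1;
        }

        if (sscanf(uts.release, "%d.%d.%d", &version, &patchlevel, &release) != 3) {
            fprintf(stderr, "unexpected release string: %s\n", uts.release);
            return 1;
        }

        printf("full release string: %s\n", uts.release);
        printf("series %d.%d, release %d\n", version, patchlevel, release);

        return 0;
    }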

Linux can also be used to designate a hardware system running the Linux kernel and various utilities running on the kernel. If a friend mentions that his development team is using Linux in their latest product, he probably means more than the kernel. A Linux system certainly includes the kernel, but most likely includes a number of other software components that are usually run with the Linux kernel. Often, these will be composed of a subset of the GNU software, such as the C library and binary utilities. It may also include the X Window System or a real-time addition such as RTAI.

A Linux system may be custom built, as you’ll see later, or can be based on an already available distribution. Your friend’s development team probably custom built their own system. Conversely, when a user says she runs Linux on the desktop, she most likely means that she installed one of the various distributions, such as Red Hat or Debian. The user’s Linux system is as much a Linux system as your friend’s, but apart from the kernel, their systems most likely have very different purposes, are built from very different software packages, and run very different applications.

Finally, Linux may also designate a Linux distribution. Red Hat, Mandrake, SuSE, Debian, Slackware, Caldera, MontaVista, Embedix, BlueCat, PeeWeeLinux, and others are all Linux distributions. They may vary in focus, size, and price, but they share a common goal: to provide the user with a shrinkwrapped set of files and an installation procedure to get the kernel and various overlaid software installed on a certain type of hardware for a certain purpose. Most of us are familiar with Linux distributions through CD-ROMs, but there are distributions that are no more than a set of files you retrieve from a web site, untar, and install according to the documentation. The difference between mainstream, user-oriented distributions and these distributions is the automated installation procedure in the mainstream ones.

Starting with the next chapter and in the rest of this book, I will avoid referring to the word “Linux” on its own. Instead, I will refer directly to the object of discussion. Rather than talking about the “Linux kernel,” I will refer to the “kernel.” Rather than talking about the “Linux system,” I will refer to the “system.” Rather than talking about a “Linux distribution,” I will refer to a “distribution.” In all these circumstances, “Linux” is implied but avoided to eliminate any possible confusion. I will continue, however, to use the term “Linux,” where appropriate, to designate the broad range of software and resources surrounding the kernel.

What Is Embedded Linux?

Again, we could start with the three designations Linux suggests: a kernel, a system, and a distribution. Yet, we would have to take the kernel off the list right away, as there is no such thing as an embedded version of the kernel distributed by Linus. This doesn’t mean the kernel can’t be embedded. It only means you do not need a special kernel to create an embedded system. Often, you can use one of the official kernel releases to build your system. Sometimes, you may want to use a modified kernel distributed by a third party, one that has been specifically tailored for a special hardware configuration or for support of a certain type of application. The kernels provided with the various embedded distributions, for example, often include some optimizations not found in the main kernel tree and are patched to support debugging tools such as kernel debuggers. Mainly, though, a kernel used in an embedded system differs from a kernel used on a workstation or a server by its build configuration. Chapter 5 covers the build process.

An embedded Linux system simply designates an embedded system based on the Linux kernel and does not imply the use of any specific library or user tools with this kernel.

An embedded Linux distribution may include a development framework for embedded Linux systems, various software applications tailored for use in an embedded system, or both.

Development framework distributions include various development tools that facilitate the development of embedded systems. This may include special source browsers, cross-compilers, debuggers, project management software, boot image builders, and so on. These distributions are meant to be installed on the development host.

Tailored embedded distributions provide a set of applications to be used within the target embedded system. This might include special libraries, executables, and configuration files to be used on the target. A method may also be provided to simplify the generation of root filesystems for the target system.

Because this book discusses embedded Linux systems, there is no need to keep repeating “embedded Linux” in every name. Hence, I will refer to the host used for developing the embedded Linux system as the “host system,” or “host,” for short. The target, which will be the embedded Linux system, will be referred to as the “target system,” or “target,” for short. Distributions providing development frameworks will be referred to as “development distributions.”[1] Distributions providing tailored software packages will be referred to as “target distributions.”

What Is Real-Time Linux?

Initially, real-time Linux designated the RTLinux project released in 1996 by Michael Barabanov under Victor Yodaiken’s supervision. The goal of the project was to provide deterministic response times under a Linux environment.

Nonetheless, today there are many more projects that provide one form or another of real-time responsiveness under Linux. RTAI, Kurt, and Linux/RK all provide real-time performance under Linux. Some projects’ enhancements are obtained by inserting a secondary kernel under the Linux kernel. Others enhance the Linux kernel’s response times by means of a patch.

The adjective “real-time” is used in conjunction with Linux to describe a number of different things. Mainly, it is used to say that the system or one of its components is supposed to have fixed response times, but if you use a strict definition of “real-time,” you may find that what is being offered isn’t necessarily “real-time.” I will discuss “real-time” issues and further define the meaning of this adjective in Section 1.2.1.2.

Real Life and Embedded Linux Systems

What types of embedded systems are built with Linux? Why do people choose Linux? What issues are specific to the use of Linux in embedded systems? How many people actually use Linux in their embedded systems? How do they use it? All these questions and many more come to mind when pondering the use of Linux in an embedded system. Finding satisfactory answers to the fundamental questions is an important part of building the system. This isn’t just a general statement. These answers will help you convince management, assist you in marketing your product, and most of all, enable you to evaluate whether your initial expectations have been met.

Types of Embedded Linux Systems

We could use the traditional segments of embedded systems such as aerospace, automotive systems, consumer electronics, telecom, and so on to outline the types of embedded Linux systems, but this would provide no additional information in regard to the systems being designated, because embedded Linux systems may be structured alike regardless of the market segment. Rather, let’s classify embedded systems by criteria that will provide actual information about the structure of the system: size, time constraints, networkability, and degree of user interaction.

Size

The size of an embedded Linux system is determined by a number of different factors. First, there is physical size. Some systems can be fairly large, like the ones built out of clusters, while others are fairly small, like the Linux watch built by IBM. Most importantly, there are the size attributes of the various electronic components of the system, such as the speed of the CPU, the size of the RAM, and the size of the permanent storage.

In terms of size, I will use three broad categories of systems: small, medium, and large. Small systems are characterized by a low-powered CPU with a minimum of 2 MB of ROM and 4 MB of RAM. This isn’t to say Linux won’t run in smaller memory spaces, but it will take you some effort to do so. If you plan to run Linux in a smaller space than this, think about starting your work from one of the various distributions that put Linux on a single floppy. If you come from an embedded systems background, you may find that you could do much more using something other than Linux in such a small system. Remember to factor in the speed at which you could deploy Linux, though.

Medium-sized systems are characterized by a medium-powered CPU with around 32 MB of ROM and 64 MB of RAM. Most consumer-oriented devices built with Linux belong to this category. This includes various PDAs, MP3 players, entertainment systems, and network appliances. Some of these devices may include secondary storage in the form of solid-state drives, CompactFlash, or even conventional hard drives. These types of devices have sufficient horsepower and storage to handle a variety of small tasks or can serve a single purpose that requires a lot of resources.

Large systems are characterized by a powerful CPU or collection of CPUs combined with large amounts of RAM and permanent storage. Usually, these systems are used in environments that require large amounts of calculations to carry out certain tasks. Large telecom switches and flight simulators are prime examples of such systems. Typically, such systems are not bound by costs or resources. Their design requirements are primarily based on functionality while cost, size, and complexity remain secondary issues.

In case you were wondering, Linux doesn’t run on any processor below 32 bits. This rules out quite a number of processors traditionally used in embedded systems. Actually, according to traditional embedded system standards, all systems running Linux would be classified as large systems. This is very true when compared to an 8051 with 4K of memory. Keep in mind, though, current trends: processors are getting faster, RAM is getting cheaper and larger, systems are as integrated as ever, and prices are going down. With growing processing demands and increasing system requirements, the types of systems Linux runs on are quickly becoming the standard. In some cases, however, it remains that an 8-bit microcontroller might be the best choice.

Time constraints

There are two types of time constraints for embedded systems: stringent and mild. Stringent time constraints require that the system react in a predefined time frame. Otherwise, catastrophic events happen. Take for instance a factory where workers have to handle materials being cut by large equipment. As a safety precaution, optical detectors are placed around the blades to detect the presence of the specially colored gloves used by the workers. When the system is alerted that a worker’s hand is in danger, it must stop the blades immediately. It can’t wait for some file to get swapped or for some task to relinquish the CPU. This system has stringent time requirements; it is a hard real-time system.

Streaming audio systems would also qualify as having stringent requirements, because any transient lagging is usually perceived as bothersome by the users. Yet, this latter example would mostly qualify as a soft real-time system, because the failure of the application to perform in a timely fashion all the time isn’t catastrophic, as it would be for a hard real-time system. In other words, the system should still be designed for stringent time requirements, but infrequent failures to meet them can be tolerated.

Mild time constraints vary a lot in requirements, but they generally apply to systems where timely responsiveness isn’t necessarily critical. If an automated teller takes 10 more seconds to complete a transaction, it’s generally not problematic. The same is true for a PDA that takes a certain number of seconds to start an application. The extra time may make the system seem slow, but it won’t affect the end result.
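
To give a feel for what requesting stringent timing behavior looks like in practice, the C sketch below uses the standard POSIX facilities to lock its memory and request a fixed-priority SCHED_FIFO scheduling policy. This is only an illustrative sketch: on a stock kernel, these calls yield soft real-time behavior at best, and hard real-time guarantees require kernel additions such as the RTLinux and RTAI projects mentioned earlier. The priority value is arbitrary, and the program must be run with root privileges.

    /* rt_sketch.c: ask the kernel for deterministic scheduling.
     * A minimal sketch of the POSIX real-time facilities; it does not,
     * by itself, turn Linux into a hard real-time system.
     */
    #include <stdio.h>
    #include <sched.h>
    #include <sys/mman.h>

    int main(void)
    {
        struct sched_param param;

        /* Lock current and future pages in RAM so time-critical code
         * never waits on paging activity. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");

        /* Request a fixed-priority, first-in first-out real-time policy. */
        param.sched_priority = 80;    /* arbitrary value in the 1-99 range */
        if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
            perror("sched_setscheduler");
            return 1;
        }

        printf("running under SCHED_FIFO at priority %d\n", param.sched_priority);
        /* ... time-critical work would go here ... */

        return 0;
    }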

Networkability

Networkability defines whether a system can be connected to a network. Nowadays, we can expect everything to be accessible through the network, even the refrigerator. This, in turn, places special requirements on the systems being built. One factor pushing people to choose Linux as an embedded OS is its proven networking capabilities. Falling prices and standardization of networking components are accelerating this trend. Most Linux devices have one form or another of network capability. You can attach a wireless network card in the Linux distribution built for the Compaq iPAQ, for instance, simply by inserting the adapter in the PCMCIA jacket. Networking issues will be discussed in detail in Chapter 10.

User interaction

The degree of user interaction varies greatly from one system to another. Some systems, such as PDAs, are centered around user interaction, while others, such as industrial process control systems, might only have LEDs and buttons for interaction. Some other systems have no user interface whatsoever. For example, some components of an autopilot system in a plane might take care of wing control but have no direct interaction with the human pilots.

Examples

The best way to get an idea of what an embedded Linux system might do is to look at examples of such systems. Trouble is, if you try to look for example embedded systems whose details are publicly available on the Internet, you will mostly find consumer devices. Very few examples of Linux in aerospace, industrial control, telecom, or automotive systems are publicly detailed. Yet, it isn’t as if Linux weren’t used in those types of applications. Rather, in contrast to consumer devices, the builders of such devices see little advantage in advertising their designs. For all they know, they may be providing critical information to competitors who may decide to switch to Linux to catch up with them. Consumer device builders, on the other hand, leverage the “hype” factor to promote their products. And given the different market dynamics between consumer products and industrial products, they can afford to play to the crowd.

Surprisingly (or maybe not so surprisingly, after all), some of the best examples of Linux in critical systems are provided in the pages of Linux Journal magazine. Digging back a few years, I was able to uncover a treasure trove of non-consumer-oriented embedded applications based on Linux. This, combined with the consumer devices detailed on the Internet and the statistics we shall see below, provides a fair image of Linux’s capabilities and future as an embedded operating system. Table 1-1 contains a summary of the example embedded Linux systems discussed below. The first column is a brief description of the system. The second column details the type of the embedded system. The next four columns characterize the system based on the criteria outlined in the previous section.

Table 1-1. Example embedded Linux systems’ characteristics

Description                     Type                        Size    Time constraints  Networkability  Degree of user interaction

Accelerator control             Industrial process control  Medium  Stringent         Yes             Low
Computer-aided training system  Aerospace                   Large   Stringent         No              High
Ericsson “blip”                 Networking                  Small   Mild              Yes             Very low
SCADA protocol converter        Industrial process control  Medium  Stringent         No              Very low
Sharp Zaurus                    Consumer electronics        Medium  Mild              Yes             Very high
Space vehicle control           Aerospace                   Large   Stringent         Yes             High

Accelerator control

The accelerator control system was built at the European Synchrotron Radiation Facility and is described in issue 66 of Linux Journal. The accelerator equipment is built from many hardware and software components that control all the aspects of experimentation. While not all software was transferred to Linux, some interesting parts have been. This includes the serial line and stepper motor controllers. Many instances of these devices are employed to control various aspects of the system. Serial lines, for instance, control vacuum devices, power supplies, and programmable logic controllers (PLCs). Stepper motors, on the other hand, are used in positioning goniometers, slits, and translation stages. Serial lines are controlled via serial boards running on PC/104.

The PC/104 single board computer (SBC) controlling the serial boards has a Pentium 90 MHz with 20 MB of RAM and a 24 MB solid-state hard disk. A standard workstation distribution, SuSE 5.3, was trimmed down to fit in the limited permanent storage space. Some stepper motor controllers run on a similar configuration, while others run on VME boards that have 8 to 32 MB of memory and load the operating system from a Unix-type server using BOOTP/TFTP. These boards run a modified version of Richard Hirst’s Linux for 680x0-based VME boards. All the equipment is network accessible and controllable through a TCP/IP network. Here, Linux, in the broad sense, was chosen because it is configurable, stable, free, and well supported, contains support for many standards, and its source code is accessible.

Computer-aided training system

The computer-aided training system (CATS) was built at CAE Electronics and is described in issue 64 of Linux Journal. Unlike full flight simulators, which include visual, sound, and motion simulation, CATS provides only a visual representation of the various aircraft panels. A CATS isn’t a cheap version of a flight simulator. Instead, it complements a flight simulator by providing entry-level training. Conventional CAE CATS were built on IBM RS/6000 workstations running AIX. A port to Linux was prompted by the low cost of powerful x86 systems and the portability of Linux itself.

The CATS comes in three different versions: one-, three-, and seven-screen systems. Development and testing were done on a workstation equipped with a Pentium II 350 MHz processor, 128 MB of RAM, and Evolution4 graphics cards from Color Graphics Systems, each of which controls four displays. Xi Graphics’ AcceleratedX X server was used to control the Evolution4 and provide adequate multiheaded display. A single-screen version could still run easily on a Linux system equipped with the standard XFree86 X server.

Because of customer requirements, the system was provided on a bootable CD-ROM to avoid local installation. Hence, the complete CATS is run from the CD-ROM using a RAM filesystem. The end system has been found to be reliable, predictable, dependable, and stable, and to exceed performance requirements. Work on prototype flight simulators running Linux began in April 2000. Given the very positive results, most full flight simulators currently shipped run Linux.

Ericsson “blip”

The Ericsson “blip” is a commercial product. Details of the product can be found on Ericsson’s blip web site at http://www.ericsson.com/about/blipnet/ and on LinuxDevices.com. “blip” stands for “Bluetooth Local Infotainment Point” and enables Bluetooth devices to access local information. This product can be used either in public places to provide services or at home for accessing or synchronizing with local information.

The blip houses an Atmel AT91F40816 ARM7TDMI paced at 22.5 MHz with 2 MB of RAM, 1 MB of system flash, and 1 MB of user flash. The Atmel chip runs the uClinux distribution provided by Lineo, with kernel 2.0.38 modified for MMU-less ARM, along with uClibc, the miniature C library, and talks via a serial link to a standalone Bluetooth chip. Access to the device is provided by a proprietary Bluetooth stack, an Ethernet interface, and a serial port. Custom applications can be developed for the blip using an SDK provided by Ericsson and built using customized GNU software. Linux was chosen because it provided an open and inexpensive development environment for both the host and the target, thereby encouraging and stimulating the development of third-party software.

SCADA protocol converter

The Supervisory Control and Data Acquisition (SCADA) protocol converter is detailed in issue 77 of Linux Journal. Here, an existing Digital Control System (DCS) controlling a turbocompressor in an oil extraction plant had to be integrated into a SCADA system to facilitate management of the plant. Converting the complete DCS for better integration would have been expensive, so the choice was made to build a conversion gateway that interfaced between the existing DCS and the SCADA system.

Linux was chosen because it is easy to tailor, it is well documented, it can run from RAM, and development can be done directly on the target system. An 8 MB DiskOnChip (DOC) from M-Systems provides a solid-state drive for the application. To avoid patching the kernel with the binary drivers provided by M-Systems, the DOC’s format is left in its shipped configuration as a DOS filesystem.[2] The kernel and root filesystem are compressed and placed in the DOC along with DOS. Upon bootup, DOS batch files invoke Loadlin to load Linux and the root filesystem. The system files are therefore read-only and the system is operated using a RAM root filesystem. The root filesystem was built using Red Hat 6.1 following the BootDisk HOWTO instructions. The system is an industrial PC with 32 MB of RAM.

Sharp Zaurus

The Sharp Zaurus is a commercial product sold by Sharp Electronics. Details on the Zaurus can be found on its web site at http://www.myzaurus.com/ and on LinuxDevices.com. The Zaurus is a Personal Digital Assistant (PDA) completely based on Linux. As such, it comes equipped with all the usual PDA applications, such as contacts, to-do list, schedule, notes, calculator, and email.

The original Zaurus, the SL-5500, was built around an Intel StrongARM 206 MHz processor with 64 MB of RAM and 16 MB of flash. A newer version, the SL-5600, is built around an Intel XScale 400 MHz processor with 32 MB of RAM and 64 MB of flash. The system is based on Lineo’s Embedix embedded Linux distribution and uses Qt’s Palmtop GUI. Independent development of the Zaurus software is encouraged by Sharp, which maintains a developer web site at http://developer.sharpsec.com/.

Space vehicle control

The space vehicle control was built at the European Space Agency (ESA) and is detailed in issue 59 of Linux Journal. The Automatic Transfer Vehicle (ATV) is an unmanned space vehicle used in the refueling and reboosting of the International Space Station (ISS). The docking process between the ATV and the ISS requires the ATV to catch up to the ISS and dock with precision. This process is governed by complex mathematical equations. Given this complexity, monitoring systems are needed to ensure that all operations proceed as planned. This is the role of the Ground Operator Assistant System (GOAS) and the Remote ATV Control at ISS (RACSI).

The GOAS runs on the ground and provides monitoring and intervention capabilities. It used to run on a Sun UltraSPARC 5-based workstation with 64 MB of RAM and 300 MB of disk space. It was ported to a Pentium 233 MHz system with 48 MB of RAM running Linux.

The RACSI runs on the ISS and provides temporary mission interruption and collision avoidance. It runs on an IBM ThinkPad with 64 MB of RAM and uses 40 MB of the available disk space. The system runs the Slackware 3.0 distribution. Moo-Tiff libraries are used to provide Motif-like widgets.

Linux was chosen because it provides the reliability, portability, performance, and affordability needed by space applications. Despite these benefits, the ESA finally decided to run the RACSI and GOAS on Solaris, using the same equipment, for operational reasons.

As these examples show, Linux can be put to use in many fields in many ways, using different hardware and software configurations. The fastest way to build an embedded system with Linux is often to look at similar projects that have used Linux in their systems. There are many more examples of embedded systems based on Linux that I have not discussed. A search through the various resources listed in Appendix B may yield fruitful leads. Keep in mind, though, that copying other projects may involve copying other people’s mistakes. In that case, the best way to guard yourself from chasing down other people’s problems is to ensure that you have an understanding of all the aspects of the system or, at least, have a pointer to where you can find more information regarding the gray areas of your system.

Survey Findings

Since Linux started being used as an embedded operating system, many surveys have been published providing information regarding various aspects of Linux’s use in this way. Though the complete results of many of the surveys are part of commercial reports, which are relatively expensive, there are a few interesting facts that have been publicized. Let’s look at the findings of some of these surveys.

In 2000, Embedded Systems Programming (ESP) magazine conducted a survey of 547 subscribers. The survey found that, though none considered it in 1998 and 1999, 38% of readers were considering using Linux as the operating system for their next design. This is especially interesting, as Linux came in second only to VxWorks, WindRiver’s flagship product. The survey also found that, though none were using it in 1998 and 1999, 12% of respondents were already using Linux in their embedded systems in 2000.

As part of reporting on embedded Linux, LinuxDevices.com set up a web-based survey in 2000 and 2001 that site visitors could fill out to provide information regarding their use of Linux in embedded systems. Both years, a few hundred respondents participated in the survey. Though there were no control mechanisms to screen respondents, the results match those of other, more formal surveys. Both surveys contained a lot of information. For the sake of simplicity, I will only mention the surveys’ most important findings.

In 2000, the LinuxDevices.com survey found that most developers considering the use of Linux in embedded systems were planning to use an x86, ARM, or PPC target with a custom board. The survey shows that most developers plan to boot Linux from a DiskOnChip or from a native flash device, and that the main peripherals included in the system would be Ethernet and data acquisition cards. The most important reasons developers have for choosing Linux are the superiority of open source software over proprietary offerings, the fact that source availability facilitates understanding the operating system, and the elimination of the dependency on a single operating system vendor. Developers reported using Red Hat, Debian, and MontaVista as their main embedded Linux distributions.

In 2001, the LinuxDevices.com survey found that developers plan to use Linux in embedded systems mostly based on x86, ARM, and PPC with custom boards. As in the previous survey, most developers plan to boot their system from some form of flash storage. In contrast with the previous survey, this survey provides insight regarding the amount of RAM and persistent storage developers intend to use. The majority of developers seem to want to use Linux with systems having more than 8 MB of RAM and 8 MB of persistent storage. In this survey, developers justify their choice of Linux based on source code availability, Linux’s reliability and robustness, and its high modularity and configurability. Developers reported that Red Hat and Debian were their main embedded Linux distributions. Combined with the 2000 survey, the results of the 2001 LinuxDevices.com survey confirm a steady interest in Linux.

Another organization that has produced reports on Linux’s use in embedded systems is the Venture Development Corporation (VDC). Though mainly aimed at companies selling products to embedded Linux developers, the VDC’s reports published in 2001 and 2002 provide some interesting facts. First, the 2001 report states that the market for embedded Linux development tools products was worth $20 million in 2000 and would be worth $306 million by 2005. The 2001 report also finds that the leading vendors are Lineo, MontaVista, and Red Hat. The report finds that the key reasons developers have for selecting Linux are source code availability and the absence of royalties.

The 2002 VDC report included a web-based survey of 11,000 developers. This survey finds that the Linux distributions currently used by developers are Red Hat, Roll-Your-Own, and non-commercial distributions. Developers’ key reasons for choosing Linux are source code availability, reduced licensing costs, reliability, and open source development community support. Interestingly, the report also lists the most important factors inhibiting Linux’s use in embedded applications. The most important factor is real-time limitations, followed by doubts about availability and quality of support, and fragmentation concerns. In addition, the report states that respondents consult the open source community for support with technical issues regarding Linux, and that most are satisfied with the answers they get.

The Evans Data Corporation (EDC) also conducted surveys in 2001 and 2002 regarding Linux’s use in embedded systems. The 2001 survey of 500 developers found that Linux was fourth in the list of operating systems currently used in embedded systems, and that Linux was expected to be the most used embedded operating system in the following year. In 2002, the survey of 444 developers found that Linux was still fourth in the list of operating systems currently used in embedded systems, and that Linux was as likely as Windows to be the operating system of choice for future designs.

While these results are partial and it is too early to predict Linux’s full impact on the embedded world, it is clear that there is great interest in embedded Linux and that this interest is growing. Moreover, the results show that the interest in Linux isn’t purely amateur. Rather, Linux is being considered for and used in professional applications and is being preferred to many of the traditional embedded OSes. Also, contrary to popular belief and widespread FUD (fear, uncertainty, and doubt), Linux isn’t interesting only because it’s free. The fact that its source code is available, that it is highly reliable, and that it can easily be tailored to the task are equally, if not more, important reasons. Interestingly, the Debian distribution is one of the favorite embedded distributions, even though no vendor is pushing this distribution on the market.

Reasons for Choosing Linux

Apart from the reasons reported in the surveys mentioned above, there are other motivations for choosing Linux over a traditional embedded OS.

Quality and reliability of code

Quality and reliability are subjective measures of the level of confidence in the code. Although an exact definition of quality code would be hard to obtain, there are properties programmers commonly expect from such code:

Modularity and structure

Each separate functionality should be found in a separate module, and the file layout of the project should reflect this. Within each module, complex functionality is subdivided into an adequate number of independent functions.

Ease of fixing

The code should be (more or less) easy to fix for whoever understands its internals.

Extensibility

Adding features to the code should be fairly straightforward. In case structural or logical modifications are needed, they should be easy to identify.

Configurability

It should be possible to select which features from the code should be part of the final application. This selection should be easy to carry out.

The properties expected from reliable code are:

Predictability

Upon execution, the program’s behavior is supposed to be within a defined framework and should not become erratic.

Error recovery

In case a problematic situation occurs, it is expected that the program will take steps to recover from the problem and alert the proper authorities, usually the system administrator, with a meaningful diagnostic message.

Longevity

The program will run unassisted for long periods of time and will conserve its integrity regardless of the situations it encounters.

Most programmers agree that the Linux kernel and most of the projects used in a Linux system fit this description of quality and reliability. The reason is the open source development model (see note below), which invites many parties to contribute to projects, identify existing problems, debate possible solutions, and fix problems effectively. You can expect to run Linux for years unattended without problems, and people have effectively done so. You can also select which system components you want to install and which you would like to avoid. With the kernel, too, you can select which features you would like during build configuration. As a testament to the quality of the code making up the various Linux components, you can follow the various mailing lists and see how quickly problems are pointed out by the individuals maintaining the various components of the software or how quickly features are added. Few other OSes provide this level of quality and reliability.

Tip

Strictly speaking, there is no such thing as the “open source” development model, or even “free software” development model. “Open source” and “free software” correspond to a set of licenses under which various software packages can be distributed. Nevertheless, it remains that software packages distributed under “open source” and “free software” licenses very often follow a similar development model. This development model has been explained by Eric Raymond in his seminal book, The Cathedral and the Bazaar (O’Reilly).

Availability of code

Code availability relates to the fact that Linux’s source code and all build tools are available without any access restrictions. The most important Linux components, including the kernel itself, are distributed under the GNU General Public License (GPL). Anyone who receives these components is therefore guaranteed access to their source code. Other components are distributed under similar licenses. Some of these licenses, such as the BSD license, permit redistribution of binaries without the original source code or the redistribution of binaries based on modified sources without requiring publication of the modifications. Nonetheless, the code for the majority of projects that contribute to the makeup of Linux is readily available without restrictions.

When source access problems arise, the open source and free software community seeks to replace the “faulty” software with an open source version providing similar capabilities. This contrasts with traditional embedded OSes, where the source code isn’t available or must be purchased for very large sums of money. The advantages of having the code available are the possibility of fixing the code without exterior help and the capability of digging into the code to understand its operation. Fixes for security weaknesses and performance bottlenecks, for example, are often very quickly available once the problem has been publicized. With traditional embedded OSes you have to contact the vendor, alert them of the problem, and await a fix. Most of the time, people simply find workarounds instead of waiting for fixes. For sufficiently large projects, managers even resort to purchasing access to the code to alleviate outside dependencies.

Hardware support

Broad hardware support means that Linux supports different types of hardware platforms and devices. Although a number of vendors still do not provide Linux drivers, considerable progress has been made and more is expected. Because a large number of drivers are maintained by the Linux community itself, you can confidently use hardware components without fear that the vendor may one day discontinue that product line. Broad hardware support also means that Linux runs, at the time of this writing, on dozens of different hardware architectures. Again, no other OS provides this level of portability. Given a CPU and a platform, you can reasonably expect that Linux runs on it or that someone else has gone through a similar porting process and can assist you in your efforts. You can also expect that the software you write for one architecture can easily be ported to another architecture Linux runs on. There are even device drivers that run on different hardware architectures transparently.

Communication protocol and software standards

Linux also provides broad communication protocol and software standards support as we’ll see throughout this book. This makes it easy to integrate Linux within existing frameworks and to port legacy software to Linux. You can easily integrate a Linux system within an existing Windows network and expect it to serve clients through Samba, while clients see little difference between it and an NT server. You can also use a Linux box to practice amateur radio by building this feature into the kernel. Likewise, Linux is a Unix clone, and you can easily port traditional Unix programs to it. In fact, most applications currently bundled with the various distributions were first built and run on commercial Unixes and were later ported to Linux. This includes all the software provided by the FSF. Most traditional embedded OSes are, in this regard, very limited and often provide support only for a limited subset of the protocols and software standards available.

Available tools

The variety of tools existing for Linux makes it very versatile. If you think of an application you need, chances are others have felt the need for this application before you. It is also likely that someone took the time to write the tool and made it available on the Internet. This is what Linus Torvalds did, after all. You can visit Freshmeat (http://www.freshmeat.net) and SourceForge (http://www.sourceforge.net) and browse around to see the variety of tools available.

Community support

Community support is perhaps one of the biggest strengths of Linux. This is where the spirit of the free software and open source community can most be felt. As with application needs, it is likely that someone has encountered the same problems as you in similar circumstances. Often, this person will gladly share his solution with you, provided you ask. The development and support mailing lists are the best place to find this community support, and the level of expertise found there often surpasses what can be found over expensive support phone calls with proprietary OS vendors. Usually, when you call a technical support line, you never get to talk to the engineers who built the software you are using. With Linux, an email to the appropriate mailing list will often get you straight to the person who wrote the software. Pointing out a bug and obtaining a fix or suggestions is thereafter a rapid process. As many programmers experience, seldom is a justified plea for help ignored, provided the sender takes the care to search through the archives to ensure that her question hasn’t already been answered.

Licensing

Licensing enables programmers to do with Linux what they could only dream of doing with proprietary software. In essence, you can use, modify, and redistribute the software with only the restriction of providing the same rights to your recipients. This, though, is a simplification of the various licenses used with Linux (GPL, LGPL, BSD, MPL, etc.) and does not imply that you lose control of the copyrights and patents embodied in the software you generate. These considerations will be discussed in Section 1.2.6. Nonetheless, the degree of liberty available is quite large.

Vendor independence

Vendor independence, as was demonstrated by the polls presented earlier, means that you do not need to rely on any sole vendor to get Linux or to use it. Furthermore, if you are displeased with a vendor, you can switch, because the licenses under which Linux is distributed provide you with the same rights as the vendors. Some vendors, though, provide additional software in their distributions that isn’t open source, and you might not be able to receive service for this type of software from other vendors. Such issues must be taken into account when choosing any distribution. Mostly, though, you can do with Linux as you would do with a car. Since the hood isn’t welded shut, as with proprietary software, you can decide to get service from a mechanic other than the one provided by the dealership where you purchased it.

Cost

The cost of Linux is a result of open source licensing and is different from what can be found with traditional embedded OSes. There are three components of software cost in building a traditional embedded system: initial development setup, additional tools, and runtime royalties. The initial development setup cost comprises the purchase of development licenses from the OS vendor. Often, these licenses are purchased for a given number of “seats,” one for each developer. In addition, you may find the tools provided with this basic development package to be insufficient and may want to purchase additional tools from the vendor. This is another cost. Finally, when you deploy your system, the vendor will ask for a per-unit royalty. This may be minimal or large, depending on the type of device you produce and the quantities produced. Some mobile phone manufacturers, for instance, choose to implement their own OSes to avoid paying any royalties. This makes sense for them, given the number of units sold and the associated profit margins.

With Linux, this cost model is turned on its head. All development tools and OS components are available free of charge, and the licenses under which they are distributed prevent the collection of any royalties on these core components. Most developers, though, may not want to go chasing down the various software tools and components and figure out which versions are compatible and which aren’t. Most developers prefer to use a packaged distribution. This involves purchasing the distribution or may involve a simple download. In this scenario, vendors provide support for their distribution for a fee and offer services for porting their distributions to new architectures and developing new drivers for a fee. This is where their money is made. They may also charge for additional proprietary software packaged with their distribution. Compared to the traditional embedded software cost model, though, this is relatively inexpensive, depending on the distribution you choose.

Players of the Embedded Linux Scene

Unlike proprietary OSes, Linux is not controlled by a single authority who dictates its future, its philosophy, and its adoption of one technology or another. These issues and others are taken care of by a broad ensemble of players with different but complementary vocations and goals.

Free software and open source community

The free software and open source community is the basis of all Linux development and is the most important player in the embedded Linux arena. It is made up of all the developers who enhance, maintain, and support the various software components that make up a Linux system. There is no central authority within this group. Rather, there is a loosely tied group of independent individuals, each with his specialty. These folks can be found discussing technical issues on the mailing lists concerning them or at gatherings such as the Ottawa Linux Symposium. It would be hard to characterize these individuals as a homogeneous group, because they come from different backgrounds and have different affiliations. Mostly, though, they care a great deal about the technical quality of the software they produce. The quality and reliability of Linux, as discussed earlier, are a result of this level of care.

Your author is actually part of the free software community and has made a number of contributions. Besides maintaining a presence on some mailing lists and participating in the advancement of free software in various ways, I wrote and maintain the Linux Trace Toolkit, which is a set of components for the tracing of the Linux kernel. I have also contributed to other free software and open source projects, including RTAI and Adeos.

Throughout this book, I will describe quite a few components that are used in Linux systems. Each maintainer of or contributor to the components I will describe is a player in the free software and open source community.

Industry

Having recognized the potential of Linux in the embedded market, many companies have moved to embrace and promote Linux in this area. Industry players are important because they are the ones pushing Linux as an end-user product. Often, they are the first to receive feedback from those end users. Although postings on the various mailing lists can tell the developer how the software is being used, not all users participate in those mailing lists. Vendors must therefore strike a balance between assisting their users and helping in the development of the various projects making up Linux, without falling into the trap of trying to divert development to their own ends. In this regard, many vendors have successfully positioned themselves in the embedded Linux market. Here are some of them.

Tip

The vendors listed here are mentioned for discussion purposes only. Your author has not evaluated the services provided by any of these vendors for the purposes of this book, and this list should therefore not be interpreted as any form of endorsement.

Red Hat

This Linux distribution is one of the most widely used, if not the most widely used. Other distributions have been inspired by this distribution or, at least, had to take it into consideration. Red Hat was one of the first Linux distributions and, as such, has an established name as a leader that has contributed time and again back to the community it took from. Through its acquisition of Cygnus, it procured some of the key developers of the GNU development toolchain. This adds to the list of key Linux contributors already working for Red Hat. Cygnus had already been providing these tools in a shrinkwrapped package to many embedded system developers. Red Hat continued on this trajectory. Although it does not sell an embedded distribution different from its standard distribution, it provides a development package for developing embedded Linux systems using its distribution. Red Hat maintains a web site about the projects it contributes to at http://sources.redhat.com/.

MontaVista

Founded by Jim Ready, an embedded industry veteran, MontaVista has positioned itself as a leader in the embedded Linux market through its products, services, and promotion of Linux in industrial applications. Its main product is MontaVista Linux, which is available in two versions: Professional and Carrier Grade. MontaVista has contributed to some open source projects including the kernel, ViewML, Microwindows, and LTT. Although MontaVista does not maintain a web site for the projects it contributes to, copies of some of its contributions can be found at http://www.mvista.com/developer/sourceforge.html.

LynuxWorks

This used to be known as Lynx Real-Time Systems and is one of the traditional embedded OS vendors. Unlike other traditional embedded OS providers, Lynx decided to embrace Linux early and changed its name to reflect its decision. This move, combined with the later acquisition of BSDi by WindRiver[3] and QNX’s decision to make its OS available as a free download, signaled that open source in general, and Linux in particular, is making serious inroads in the embedded arena. That said, LynuxWorks still develops, distributes, and supports Lynx. In fact, LynuxWorks promotes a twofold solution. According to LynuxWorks, programmers needing hard real-time performance should continue to use Lynx, while those wanting open source solutions should use BlueCat, their embedded Linux distribution. LynuxWorks has even modified Lynx to enable unmodified Linux binaries to run as-is. The fact that LynuxWorks was already a successful embedded OS vendor and that it adopted Linux early confirms the importance of the move towards open source OSes in the embedded market.

There are also many small players who provide a variety of services around open source and free software. In fact, many open source and free software contributions are made by individuals who are either independent or work for small-size vendors. As such, the services provided by such small players are often on a par with, and sometimes surpass, those provided by larger players. Here are some individuals and small companies who provide embedded Linux services and are active contributors to the open source and free software community: Alessandro Rubini, Bill Gatliff, CodePoet Consulting, DENX Software Engineering, Opersys, Pengutronix, System Design & Consulting Services, and Zee2.

Organizations

There are currently three organizational bodies aimed at promoting and encouraging the adoption of Linux in embedded applications: the Embedded Linux Consortium (ELC); Emblix, the Japan Embedded Linux Consortium; and the TV Linux alliance. The ELC was founded by 23 companies as a nonprofit vendor-neutral association and now includes more than 100 member companies. Its current goals include the creation of an embedded Linux platform specification inspired by the Linux Standard Base and the Single Unix Specification. It remains unclear whether the ELC’s specification will gain any acceptance from the very open source and free software developers that maintain the software the ELC is trying to standardize, given that the drafting of the standard is not open to the public, which is contrary to the open source and free software traditions. Emblix was founded by 24 Japanese companies with aims similar to the ELC’s but with particular emphasis on the Japanese market. The TV Linux alliance is a consortium that includes cable, satellite, and telecom technology suppliers and operators who would like to support Linux in set-top boxes and interactive TV applications.

These efforts are noteworthy, but there are other organizational bodies that influence Linux’s advancement, in the broad sense, although they do not address embedded systems particularly.

First and foremost, the Free Software Foundation (FSF), launched by Richard Stallman in 1985, is the maintainer of the GNU project, on which most components of a Linux system are based. It is also the central authority on the GPL and LGPL, the licenses most software in a Linux system falls under. Since its foundation, the FSF has promoted the usage of free software[4] in all aspects of computing. The FSF has taken note of the recent rise in the use of GNU and GPL software in embedded systems and is moving to ensure that user and developer rights are preserved.

The Open Group maintains the Single Unix Specification (SUS), which describes what should be found in a Unix system. There is also the Linux Standard Base (LSB) effort, which aims at developing and promoting “a set of standards that will increase compatibility among Linux distributions and enable software applications to run on any compliant Linux system,” as stated on the LSB web site at http://www.linuxbase.org/. In addition, the Filesystem Hierarchy Standard (FHS) maintained by the Filesystem Hierarchy Standard Group specifies the content of a Linux root tree. The Free Standards Group (FSG) maintains the Linux Development Platform Specification (LDPS), which specifies the configuration of a development platform to enable applications developed on conforming platforms to run on most distributions available. Finally, there is the Real-Time Linux Foundation, which aims at promoting and standardizing real-time enhancements and programming in Linux.

Resources

Most developers connect to the embedded Linux world through various resource sites and publications. It is through these sites and publications that the Linux development community, industry, and organizations publicize their work and learn about the work of the other players. In essence, the resource sites and publications are the meeting place for all the people concerned with embedded Linux. A list of resources can be found in Appendix B, but there are two resources that stand out, LinuxDevices.com and Linux Journal.

LinuxDevices.com was founded on Halloween day[5] 1999 by Rick Lehrbaum. It has since been acquired by ZDNet and, later still, been purchased by a company owned by Rick. To this day, Rick continues to maintain the site. LinuxDevices.com features news items, articles, polls, forums, and many other links pertaining to embedded Linux. Many key announcements regarding embedded Linux are made on this site. The site contains an archive of actively maintained articles regarding embedded Linux. Though its vocation is clearly commercial, I definitely recommend taking a peek at the site once in a while to keep yourself up to date with the latest in embedded Linux. Among other things, LinuxDevices.com was instrumental in launching the Embedded Linux Consortium.

As part of the growing interest in the use of Linux in embedded systems, the Embedded Linux Journal (ELJ) was launched by Specialized System Consultants, owners of Linux Journal (LJ), in January 2001 with the aim of serving the embedded Linux community, but was later discontinued. Though ELJ is no longer published as a separate magazine, LJ now contains an “embedded” section every month, which contains articles that otherwise would have been published in ELJ.

Copyright and Patent Issues

You may ask: what about using Linux in my design? Isn’t Linux distributed under this weird license that may endanger the copyrights and patents of my company? What are all those licenses anyway? Is there more than one license to take care of? Are we allowed to distribute binary-only kernel modules? What about all these articles I read in the press, some even calling Linux’s license a “virus”?

These questions and many more have probably crossed your mind. You probably even discussed some of these issues with some of your coworkers. The issues can be confusing and can come back to haunt you if they aren’t dealt with properly. I don’t say this to scare you. The issues are real, but there are known ways to use Linux without fear of licensing contamination of any sort. With all the explanations provided below, keep in mind that this isn’t legal counsel and I am not a lawyer. If you have any doubts about your specific project, consult your attorney.

OK, now that I’ve given you ample warning, let us look at the commonly accepted thinking on Linux’s licensing and how it applies to Linux systems in general, including embedded systems.

Textbook GPL

For most components making up a Linux system, there are two licenses involved, the GPL and the LGPL, introduced earlier. Both licenses are available from the FSF’s web site at http://www.gnu.org/licenses/, and should be included with any package distributed under the terms of these licenses.[6] The GPL is mainly used for applications, while the LGPL is mainly used for libraries. The kernel, the binary utilities, the gcc compiler, and the gdb debugger are all licensed under the GPL. The C library and the GTK widget toolkit, on the other hand, are licensed under the LGPL.

Some programs may be licensed under BSD, Mozilla, or another license, but the GPL and LGPL are the main licenses used. Regardless of the license being used, common sense should prevail. Make sure you know the licenses under which the components you use fall and understand their implications.

The GPL provides rights and imposes obligations very different from what may be found in typical software licenses. In essence, the GPL is meant to provide a higher degree of freedom to developers and users, enabling them to use, modify, and distribute software with few restrictions. It also makes provisions to ensure that these rights are not abrogated or hijacked in any fashion. To do so, the GPL stipulates the following:

  • You may make as many copies of the program as you like, as long as you keep the license and copyright intact.

  • Software licensed under the GPL comes with no warranty whatsoever, unless it is offered by the distributor.

  • You can charge for the act of copying and for warranty protection.

  • You can distribute binary copies of the program, as long as you accompany them with the source code used to create the binaries, often referred to as the “original” source code.[7]

  • You cannot place further restrictions on your recipients than what is provided by the GPL and the software’s original authors.

  • You can modify the program and redistribute your modifications, as long as you provide the same rights you received to your recipients. In effect, any code that modifies or includes GPL code, or any portion of a GPL’d program, cannot be distributed outside your organization under any license other than the GPL. This is the clause some PR folks refer to as being “virus”-like. Keep in mind, though, that this restriction concerns source code only. Packaging the unmodified software for the purpose of running it, as we’ll see below, is not subject to this provision.

As you can see, the GPL protects authors’ copyrights while providing freedom of use. This is fairly well accepted. The application of the modification and distribution clauses, on the other hand, generates a fair amount of confusion. To clear this confusion, two issues must be focused on: running GPL software and modifying GPL software. Running the software is usually the reason why the original authors wrote it. The authors of gcc, for example, wrote it to compile software with. As such, the software compiled by an unmodified gcc is not covered by the GPL, since the person compiling the program is only running gcc. In fact, you can compile proprietary software with gcc, and people have been doing this for years, without any fear of GPL “contamination.” Modifying the software, in contrast, creates a derived work that is based on the original software, and is therefore subject to the licensing of that original software. If you take the gcc compiler and modify it to compile a new programming language of your own design, for example, your new compiler is a derived work and all modifications you make cannot be distributed outside your organization under the terms of any license other than the GPL.

Most anti-GPL speeches or writings play on the confusion between running and modifying GPL software, to give the audience an impression that any software in contact with GPL software is under threat of GPL “contamination.” This is not the case.

There is a clear difference between running and modifying software. As a developer, you can safeguard yourself from any trouble by asking yourself whether you are simply running the software as it is supposed to be run or if you are modifying the software for your own ends. As a developer, you should be fairly capable of making out the difference.

Note that copyright law makes no distinction between static and dynamic linking. Even if your proprietary application is integrated with the GPL software at runtime through dynamic linking, that doesn’t exclude it from falling under the GPL. A derived work combining GPL software and non-GPL software through any form of linking still cannot be distributed under any license other than the GPL. If you package gcc as a dynamic linking library and write your new compiler using this library, you will still be restricted from distributing your new compiler under any license other than the GPL.

Whereas the GPL doesn’t allow you to include parts of the program in your own program unless your program is distributed under the terms of the GPL, the LGPL allows you to use unmodified portions of the LGPL program in your program without any problem. If you modify the LGPL program, though, you fall under the same restrictions as the GPL and cannot distribute your modifications outside your organization under any license other than the LGPL. Linking a proprietary application, statically or dynamically, with the C library, which is distributed under the LGPL, is perfectly acceptable. If you modify the C library, on the other hand, you are prohibited from distributing all modifications under any license other than the LGPL.

Tip

Note that when you distribute a proprietary application that is linked against LGPL software, you must allow for this LGPL software to be replaced. If you are dynamically linking against a library, for example, this is fairly simple, because the recipient of your software need only modify the library to which your application is linked at startup. If you are statically linking against LGPL software, however, you must also provide your recipient with the object code of your application before it was linked so that she may be able to substitute the LGPL software.

Much like the running versus modifying GPL software discussion above, there is a clear difference between linking against LGPL software and modifying LGPL software. You are free to distribute your software under any license when it is linked against an LGPL library. You are not allowed to distribute any modifications to an LGPL library under any license other than LGPL.

Pending issues

Up to now, I’ve discussed only the textbook application of the GPL and LGPL. Some areas of application are, unfortunately, less clearly defined. What about applications that run using the Linux kernel? Aren’t they being linked, in a way, to the kernel’s own code? And what about binary kernel modules, which are even more deeply integrated into the kernel? Do they fall under the GPL? What about including GPL software in my embedded system?

I’ll start with the last question. Including a GPL application in your embedded system is actually a textbook case of the GPL. Remember that you are allowed to redistribute binary copies of any GPL software as long as your recipients receive the original source code. Distributing GPL software in an embedded system is a form of binary distribution and is allowed, provided you respect the other provisions of the GPL regarding running and modifying GPL software.

Some proprietary software vendors have tried to cast doubts about the use of GPL software in embedded systems by claiming that the level of coupling found in embedded systems makes it hard to differentiate between applications and, hence, between what falls under GPL and what doesn’t. This is untrue. As we shall see in Chapter 6 and Chapter 8, there are known ways to package embedded Linux systems that uphold modularity and the separation of software components.

To avoid any confusion regarding the use of user applications with the Linux kernel, Linus Torvalds has added a preamble to the GPL license found with the kernel’s source code. This preamble has been reproduced verbatim in Appendix C and stipulates that user applications running on the kernel are not subject to the GPL. This means that you can run any sort of application on the Linux kernel without any fear of GPL “contamination.” A great number of vendors provide user applications that run on Linux while remaining proprietary, including Oracle, IBM, and Adobe.

The area where things are completely unclear is binary-only kernel modules. Modules are software components that can be dynamically loaded and unloaded to add functionality to the kernel. While they are mainly used for device drivers, they can and have been used for other purposes. Many components of the kernel can actually be built as loadable modules to reduce the kernel image’s size. When needed, the various modules can be loaded during runtime.

Although this was intended as a facilitating and customizing architecture, many vendors and projects have come to use modules to provide capabilities to the kernel while retaining control over the source code or distributing it under licenses different from the GPL. Some hardware manufacturers, for instance, provide closed-source binary-only module drivers to their users. This enables the use of the hardware with Linux without requiring the vendor to provide details regarding the operation of their device.

The problem is that once a module is loaded in the kernel, it effectively becomes part of its address space and is highly coupled to it because of the functions it invokes and the services it provides to the kernel. Because the kernel is itself under the GPL, many contend that modules cannot be distributed under any other license than the GPL because the resulting kernel is a derived work. Others contend that binary-only modules are allowed as long as they use the standard services exported to modules by the kernel. For modules already under the GPL, this issue is moot, but for non-GPL modules, this is a serious issue. Linus has said more than once that he allows binary-only modules as long as it can be shown that the functionality implemented is not Linux specific, as you can see in some of his postings included in Appendix C. Others, however, including Alan Cox, have come to question his ability to allow or disallow such modules, because not all the code in the kernel is copyrighted by him. Others, still, contend that because binary modules have been tolerated for so long, they are part of standard practice.
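
In practice, the kernel keeps track of module licensing through the MODULE_LICENSE() macro, and loading a module that does not declare a GPL-compatible license marks, or “taints,” the running kernel. The following minimal module is only a sketch meant to show where this declaration fits; it is not drawn from any real driver:

    /* hello.c: minimal loadable kernel module (sketch) */
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>

    static int __init hello_init(void)
    {
        printk(KERN_INFO "hello: module loaded\n");
        return 0;
    }

    static void __exit hello_exit(void)
    {
        printk(KERN_INFO "hello: module unloaded\n");
    }

    module_init(hello_init);
    module_exit(hello_exit);

    /* Declares the module's license to the kernel; anything other than a
     * GPL-compatible string taints the kernel when the module is loaded. */
    MODULE_LICENSE("GPL");

Declaring “GPL” here does not, by itself, settle the legal debate described above; it merely records the license the module’s author claims for the code.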

There is also the case of binary-only modules that use no kernel API whatsoever. The RTAI and RTLinux real-time tasks inserted in the kernel are prime examples. Although it could be argued that these modules are a class of their own and should be treated differently, they are still linked into kernel space and fall under the same rules as ordinary modules, whatever you consider those rules to be.

At the time of this writing, there is no clear, definitive, accepted status for binary-only modules, though they are widely used and accepted as legitimate. Linus’ latest public statements on the matter, made during a kernel mailing list debate on the Linux Security Module infrastructure (reproduced verbatim in Appendix C), seem to indicate that the use of binary-only modules is an increasingly risky decision. Indeed, binary-only modules are likely to remain a legally dubious practice for the foreseeable future. If you think you need to resort to binary-only proprietary kernel modules for your system, I suggest you follow Alan Cox’s advice and seek legal counsel beforehand. Actually, I also suggest you reconsider and use GPL modules instead. This will save you many headaches.

RTLinux patent

Perhaps one of the most restrictive and controversial licenses you will encounter in deploying Linux in an embedded system is the license to the RTLinux patent held by Victor Yodaiken, the RTLinux project leader. The patent covers the addition of real-time support to general purpose operating systems as implemented by RTLinux.

Although many have questioned the patent’s viability, given prior art, and Victor’s handling of the issue, it remains that both the patent and the license are currently legally valid, at least in the United States, and have to be accounted for. The U.S. Patent Number for the RTLinux patent is 5,995,745, and you can obtain a copy of it through the appropriate channels. The patent license that governs the use of the patented method is available on the Web at http://www.fsmlabs.com/about/patent/.

The license lists a number of requirements for gratis use of the patented method. Notably, the license stipulates that there are two approved uses of the patented process. The first involves using software licensed under the terms of the GPL, and the second involves using an unmodified version of the “Open RTLinux” as distributed by FSMLabs, Victor Yodaiken’s company. The traditional way in which these requirements have been read by real-time Linux developers is that anyone distributing non-GPL real-time applications needs to purchase a license from FSMLabs. Not so, says Eben Moglen, the FSF’s chief legal counsel. In a letter that was sent to the RTAI community, the original of which is available at http://www.aero.polimi.it/~rtai/documentation/articles/moglen.html, Moglen makes the following statement: “No application in a running RTLinux or RTAI system does any of the things the patent claims. No applications program is therefore potentially infringing, and no applications program is covered, or needs to be covered, by the license.”

Though Moglen’s authoritative statement is clear on the matter, it remains that FSMLabs’ continued refusal to provide explanations regarding the patent’s reach has left a cloud of uncertainty regarding all real-time extensions using the patented process.

It follows from this that the only way to stay away from this mess is to avoid using the patented process altogether. In other words, a method other than the one covered by the patent must be used to obtain deterministic response times from Linux. Fortunately, such a method exists.

Basing myself entirely on scientific articles on nanokernel research published more than one year earlier than the preliminary patent application, I wrote a white paper describing how to implement a Linux-based nanokernel to enable multiple OSes to share the same hardware. The white paper, entitled “Adaptive Domain Environment for Operating Systems,” was published in February 2001 and is available from http://www.opersys.com/adeos/ along with other papers on other possible uses of this method. Given that I started working on this book soon after the paper’s publication, little development effort was put into the project, and the idea lay dormant for over a year.

The situation changed in late April 2002 when Philippe Gerum, a very talented free software developer, picked up the project and decided to push it forward. By early June, we were sufficiently satisfied with the project’s status to make the first public release of the Adeos nanokernel. The release made on June 3, 2002, was endorsed by several free software organizations throughout the world, including Eurolinux (http://www.eurolinux.org/) and April (http://www.april.org/), as a patent-free method for allowing real-time kernels to coexist with general purpose kernels. Though, as with any other patent, such endorsements do not constitute any guarantee against patent infringement claims, the consensus within the open source and free software community is that the Adeos nanokernel and its applications are indeed patent free. For my part, I encourage you to make your own verifications, as you should do for any patent. Among other things, review the original white paper and, most importantly, the scientific articles mentioned therein.

Already, Adeos is being used by developers around the world for allowing different types of kernels to coexist. RTAI, for instance, which previously used the patented process to take control from Linux, and was therefore subject to the patent license, has already been ported to Adeos. Though at the time of this writing Adeos runs on single processor and SMP x86 systems only, ports to other architectures should be relatively straightforward, given the nanokernel’s simplicity. If you are interested in contributing to Adeos, by porting it to other architectures for example, or if you would just like to use it or get more information, visit the project’s web site at http://www.adeos.org/.

Using Distributions

Wouldn’t it be simpler and faster to use a distribution instead of setting up your own development environment and building the whole target system from scratch? What’s the best distribution? Unfortunately, there are no straightforward answers to these questions. There are, however, some aspects of distribution use that might help you find answers to these and similar questions.

To use or not to use

First and foremost, you should be aware that it isn’t necessary to use any form of distribution to build an embedded Linux system. In fact, all the necessary software packages are readily available for download on the Internet. It is these same packages that distribution providers download and package for you to use. This approach provides you with the highest level of control over, and understanding of, the packages you use and their interactions. While this is the most thorough approach and the one used within this book, it is also the most time-consuming, as you have to take the time to find matching package versions and then set up each package one by one while ensuring that you meet package interaction requirements.

Hence, if you need a high degree of control over the content of your system, the “do it yourself” method may be best. If, however, like most people, you need the project ready yesterday or if you do not want to have to maintain your own packages, you should seriously consider using both a development and a target distribution. In that case, you will need to choose the development and target distributions most appropriate for you.

How to choose a distribution

There are a number of criteria to help in the choice of a distribution, some of which have already been mentioned in Section 1.2.3. Depending on your project, you may also have other criteria not discussed here. In any case, if you choose commercial distributions, make sure you have clear answers to your questions from the distribution vendor when you evaluate his product. As in any situation, if you ask broad questions, you will get broad answers. Use detailed questions and expect detailed answers. Unclear answers to precise questions are usually a sign that something is amiss. If, however, you choose an open source distribution,[8] make sure you have as much information as possible about it. The difference between choosing an open source distribution and a commercial distribution is the way you obtain answers to your questions about the distribution. Whereas the commercial distribution vendor will provide you with answers to your questions about his product, you may have to look for the answers to those same questions about an open source distribution on your own.

An initial factor in the choice of a development or target distribution is the license or licenses involved. Some commercial distributions are partly open source and distribute value-added packages under conventional software licenses prohibiting copying and imposing royalties. Make sure the distribution clearly states the licenses governing the usage of the value-added software and their applicability. If unsure, ask. Don’t leave licensing issues unclear.

Before evaluating a distribution, make yourself a shopping list of packages you would like to find in it. The distribution may have something better to offer, but at least you know if it fits your basic requirements. A development distribution should include items covered in Section 1.4.2, whereas a target distribution should automate and/or facilitate, to a certain degree, items covered in Section 1.4.1 and Section 1.4.4. Of course, no distribution can take away issues discussed in Section 1.4.3, since only the system developers know what type of programming is required for the system to fit its intended purpose.

One thing that distinguishes commercial distributions from open source distributions is the support provided by the vendor. Whereas the vendor supplying a commercial distribution almost always provides support for her own distribution, the open source community supplying an open source distribution does not necessarily provide the same level of support that would be expected from a commercial vendor. This, however, does not preclude some vendors from providing commercial support for open source distributions. Through serving different customers with different needs in the embedded field, the various vendors build a unique knowledge about the distributions they support and the problems clients might encounter during their use, and are therefore best placed to help you efficiently. Mainly, though, these vendors are the ones who keep up with the latest and greatest in Linux and are therefore the best source of information regarding possible bugs and interoperability problems that may show up.

Reputation can also come into play when choosing a distribution, but it has to be used wisely, as a lot of information circulating may be presented as fact while being mere interpretation. If you’ve heard something about one distribution or another, take the time to verify the validity of the information. In the case of a commercial distribution, contact the vendor. Chances are he knows where this information comes from and, most importantly, the rational explanation for it. This verification process, though, isn’t specific to embedded Linux distributions. What is specific to embedded Linux distributions is the reputation commercial distributions build when their vendors contribute to the open source community. A vendor that contributes back by providing more open source software or by financing development shows that he is in contact with the open source community and has therefore a privileged position in understanding how the changes and developments of the various open source projects will affect his future products and ultimately his clients. In short, this is a critical link and a testament to the vendor’s understanding of the dynamics involved in the development of the software he provides you. In the case of open source distributions, this criterion is already met, as the distribution itself is an open source contribution.

Another precious tool commercial distributions might have to offer is documentation. In this day and age where everything is ever-changing, up-to-date and accurate documentation is a rare commodity. The documentation for the majority of open source projects is often out of date, if available at all. Linus Torvalds’ words ring true here. “Use the source, Luke,” he says, meaning that if you need to understand the software you should read the source code. Yet not everyone can invest the amount of time necessary to achieve this level of mastery, hence the need for appropriate documentation. Because the open source developers prefer to invest time in writing more code than in writing documentation, it is up to the distribution vendors to provide appropriately packaged documentation with their distributions. When evaluating a distribution, make sure to know the type and extent of accompanying documentation. Although there is less documentation for open source distributions, in comparison with commercial distributions, some open source distributions are remarkably well documented.

Given the complexity of some aspects of development and target setup, the installation of a development and/or target distribution can be hard. In this regard, you may be looking for easy-to-install distributions. Although this is legitimate, keep in mind that once you’ve installed the distributions, you should not need to reinstall them afterward. Notice also that installation does not really apply to a target distribution, as it was defined earlier, because target distributions are used to facilitate the generation of target setups and don’t have what is conventionally known as an “installation” process. The three things you should look for in the installation process of a distribution are clear explanations (whether textually during the installation, in a manual, or both), configurability, and automation. Configurability is a measure of how much control you have over the packages being installed, while automation is the ability to automate the process using files containing the selected configuration options.

With some CPU models and boards being broadly adopted for embedded systems development, commercial distribution vendors have come to provide prepackaged development and/or target distributions specifically tailored for those popular CPU models and boards. If you are intending to use a specific CPU model or board, you may want to look for a distribution that is already tested for your setup.

What to avoid doing with a distribution

There is one main course of action to avoid when using a distribution: using the distribution in a way that makes you dependent solely on this same distribution for all future development. Remember that one of the main reasons to use Linux is that you aren’t subject to anyone’s will and market decisions. If your development relies solely on proprietary tools and methods of the distribution you chose, you are at risk of being locked into continuous use of that same distribution for all future development. This does not mean, though, that you shouldn’t use commercial distributions with value-added software the likes of which cannot be found on other distributions. It only means that you should have a backup plan to achieve the same results with different tools from different distributions, just in case.

Example Multicomponent System

To present and discuss the material throughout the book, this section will examine an example embedded Linux system. This embedded system is composed of many interdependent components, each of which is an individual embedded system. The complete system has a set of fixed functionalities, as seen by its users, but the individual components may vary in composition and implementation. Hence, the example provides us with fertile ground for discussing various solutions, their trade-offs, and their details. Overall, the system covers most types of embedded systems available, from the very small to the very large, including many degrees of user interaction and networking and covering various timing requirements.

General Architecture

The embedded system used as the basis of the examples in this book is an industrial process control system. It is composed of networked computers all running Linux. Figure 1-1 presents the general architecture of the example system.

Figure 1-1. Example embedded Linux system architecture

Internally, the system is made up of four different types of machines, each fulfilling a different purpose: data acquisition (DAQ), control, system management (SYSM), and user interface (UI). The components interconnect using the most common interface and protocol available, TCP/IP over Ethernet. In this setup, the acquisition and control modules sit on a dedicated Ethernet link, while the user interface modules sit on another link. In addition to being the interface between the two links, the system management (SYSM) module provides an interface to the “outside world,” which may be a corporate intranet, through a third link.

The process being controlled here could be part of a factory, treatment facility, or something completely different, but this is of no importance to the main design being discussed, because all process control systems have similar architectures. To control a process, the system needs to know at all times the current state of the different components of the process. This is what data acquisition is for. Having acquired the data, the system can determine how to keep the process under control. The location where the analysis is conducted may vary, but all control commands will go out through the control module. Because some aspects of the process being controlled often need human interaction and/or monitoring, there has to be a way for the workers involved to observe and modify the process. This is provided by the various user interfaces. To glue all this together and provide a central data repository and management interface, the SYSM module is placed at the center of all the components while providing a single access point into the system from the outside world.

Requirements of Each Component

Each component has its own set of requirements to fit in the grand scheme of things and is, therefore, built differently. Here is a detailed discussion of each component.

Data acquisition module

The first components of process measurement are transducers. Transducers are devices that convert a physical phenomenon into an electrical signal. Thermocouples, strain gauges, accelerometers, and linear variable differential transformers (LVDTs) are all transducers that measure temperature, mechanical variations, acceleration, and displacement, respectively. The transducers are usually placed directly within the area where the process is taking place. If a furnace boils a liquid of which the temperature needs to be monitored, a thermocouple would be placed within the receptacle containing the liquid.

The electrical signals output by transducers often go through various stages of signal conditioning, which may include amplification, attenuation, filtering, and isolation, before eventually being fed to a DAQ device. The DAQ device, often a DAQ card installed in a computer, samples the analog values, converts them to digital values, and stores these values in a sample buffer. Various software components can then use these values to plot curves, detect certain conditions, or modify certain control parameters in reaction to the signal, such as in a feedback loop.

As DAQ is a vast domain discussed by a number of books, it is not the purpose of this chapter to discuss DAQ in full. Rather, we will assume that all signal retrieval and conditioning is already done. Also, rather than limiting the discussion to one DAQ card in particular, we will assume a DAQ card for which a driver exists complying with the API provided by Comedi, a software package for data acquisition and control, which I will cover later.
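
Although Comedi and its API are covered later in the book, the following sketch gives an idea of what retrieving a single raw sample through the Comedi library looks like. The device file, subdevice, channel, and range numbers used here are arbitrary examples and depend entirely on the actual DAQ card and driver:

    /* read_sample.c: read one raw sample through Comedi (sketch) */
    #include <stdio.h>
    #include <comedilib.h>

    int main(void)
    {
        comedi_t *dev;
        lsampl_t sample;

        dev = comedi_open("/dev/comedi0");   /* first Comedi device */
        if (dev == NULL) {
            comedi_perror("comedi_open");
            return 1;
        }

        /* subdevice 0, channel 0, range 0, ground-referenced input */
        if (comedi_data_read(dev, 0, 0, 0, AREF_GROUND, &sample) < 0) {
            comedi_perror("comedi_data_read");
            comedi_close(dev);
            return 1;
        }

        printf("raw sample: %u\n", sample);
        comedi_close(dev);
        return 0;
    }

A real DAQ module would, of course, read continuously at a configured sample rate and buffer the results rather than fetch a single value.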

Hence, the DAQ module is an industrial computer containing a DAQ card controlled via Comedi to retrieve data about the process. The computer runs a medium-sized embedded system with stringent time constraints and no user interface, while being connected to the rest of the system using Ethernet.[9]

In a typical setup, the DAQ module stores the data retrieved in a local buffer. Analysis may be conducted on this data on site or it may be transferred to the SYSM module for analysis. In any case, important data is forwarded to the SYSM module for backup and display by the various UIs. When analysis is conducted onsite, the DAQ module will signal the SYSM module if it detects an anomaly or critical situation. Conversely, the DAQ module will carry out the DAQ commands sent by the SYSM module. These commands may dictate sample rate, analysis parameters, or even what the module should do with acquired data once analysis is over. For the SYSM module to be aware of the DAQ module’s operations, the DAQ module will forward status and errors to the SYSM module whenever necessary or whenever it is asked to do so.

The DAQ module typically boots off a CompactFlash or a native flash device and uses a RAM disk or CRAMFS. This lets the module be replaced easily in case of hardware failure. Software configuration involves a kernel built for preemption running on either a PC-type system or a system based on the PowerPC architecture. The DAQ provides no outside services such as FTP, HTTP, or NFS. Instead, it runs custom daemons that communicate with the SYSM module to carry out the proper behavior of the overall system. Because it is not a multiuser system and no user ever interacts with it directly, the DAQ module has only minimal support for user tools. This may involve the BusyBox package. The IP address used by the DAQ is fixed and determined at design time. Hence, the SYSM module can easily check whether the DAQ module is alive and operational.
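
To illustrate the sort of custom daemon just described, here is a sketch of a routine a DAQ-side daemon could use to report status to the SYSM module over TCP/IP. The SYSM address, port number, and message format are hypothetical and would be fixed at design time:

    /* report_status.c: send a status string to the SYSM module (sketch) */
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    #define SYSM_ADDR "10.0.0.1"   /* fixed address chosen at design time */
    #define SYSM_PORT 4000         /* hypothetical status port */

    int report_status(const char *msg)
    {
        struct sockaddr_in sysm;
        int sock = socket(AF_INET, SOCK_STREAM, 0);

        if (sock < 0)
            return -1;

        memset(&sysm, 0, sizeof(sysm));
        sysm.sin_family = AF_INET;
        sysm.sin_port = htons(SYSM_PORT);
        inet_pton(AF_INET, SYSM_ADDR, &sysm.sin_addr);

        if (connect(sock, (struct sockaddr *)&sysm, sizeof(sysm)) < 0) {
            close(sock);
            return -1;   /* SYSM unreachable; caller decides how to react */
        }

        write(sock, msg, strlen(msg));
        close(sock);
        return 0;
    }

    int main(void)
    {
        return report_status("DAQ: analysis complete, data buffered\n") ? 1 : 0;
    }

An actual implementation would keep the connection open, frame its messages, and handle errors according to the system’s requirements, but the principle remains the same.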

Control module

Conventional process control involves programmable logic controllers (PLCs) and similar systems that are expensive, run their own particular OSes, and have special configuration procedures. With the advent of inexpensive modern hardware on the consumer market, it is becoming more common to see mainstream hardware such as PCs used in process control. Even industrial hardware has seen its price falling because of the use of mainstream technology.

Here too, process control is a vast domain and I do not intend to cover it in full. Instead, we assume that the hardware being controlled is modeled by a state machine. The overlaying software receives feedback to its control commands based on the current state of the controlled hardware as modeled by the state machine.

The control module is an industrial computer with an interface to the hardware being controlled. The computer runs a medium-sized embedded system with stringent time constraints and no user interface, much like the DAQ module, while being connected to the rest of the system using an Ethernet link.

The control module’s main task is to issue commands to the hardware it controls, while monitoring the progression of the hardware’s behavior in reaction to the commands. Typically, these commands originate from the SYSM module, which is the central decision maker, and that will make decisions according to the data it gets from the DAQ module. Because the commands provided by the SYSM module may involve many hardware operations, the control module will coordinate the hardware to obtain the final result requested by the SYSM. Once operations are complete, whenever any special situation occurs or whenever it is requested, the control module will send the SYSM module a status report on the current hardware operations.

The control module can boot off a CompactFlash or a CFI flash device and use a RAM disk or CRAMFS, much like the DAQ module. It is based on a PowerPC board, which runs a kernel configured for preemption along with a real-time kernel, such as RTAI or RTLinux, since hard real-time response times are necessary to control complex hardware. Hardware control will therefore be carried out by custom, hard real-time drivers. Here too, no outside networking services are provided. Custom daemons communicate with the SYSM to coordinate system behavior. Because the control module is not a multiuser system and has no direct user interaction, only minimal user tools will be available. BusyBox may be used. The control module also uses a fixed IP address for the same reason as the DAQ module.

System management module

The SYSM module manages and coordinates the interactions between the different components of the system, while providing a point of entry into the system to the outside world, as mentioned earlier. It is a large embedded system with stringent time constraints and no user interface. It contains three network adapters: one for DAQ and control, one for user interfaces, and one for the outside network. Each networking interface has its set of rules and services.

On link A, the SYSM module retrieves data from the DAQ module, stores all or parts of it, and forwards pertinent data to the various UIs for display. The stored data can be backed up for future reference and may form the base of a quality control procedure. The data can be backed up either by means of conventional backup or using a database that has a backup procedure. As said earlier, the SYSM module may carry out analysis on acquired data if this isn’t done on the DAQ module. Whether the analysis is done locally or on the DAQ module, the SYSM module will issue commands to the control module according to that analysis and according to the current state of the controlled process. The SYSM module runs custom daemons and utilities that complement the daemons present on the DAQ module and control module to communicate with them appropriately. As with the other elements on link A, the SYSM module has a fixed IP address so the DAQ and control modules can identify it easily.

To the outside network, the SYSM module provides HTTP and SSH services. The HTTP service enables authorized users on the outside network to configure or monitor various aspects of the complete system through the use of web pages and forms. The SSH services make it possible for the embedded system’s manufacturer to log into the system from a remote site for troubleshooting and upgrades. The availability of an SSH server on such a large system reduces maintenance cost for both the manufacturer and the client.

One of the configurable options of the SYSM module is the way errors are reported to the outside world. This indicates to the SYSM what it should do with an error it cannot handle, such as the failure of the DAQ or control module. The standard procedure may be to signal an alarm on a loudspeaker, or it may involve using SNMP to signal the system operator or simply sending a critical display request to the appropriate UI module. The link to the outside world is another configurable option. The SYSM module may either have a fixed IP address or retrieve its IP address using DHCP or BOOTP.

On link B, the SYSM module offers DHCP services so the UIs can dynamically allocate their addresses. Once UIs are up and have registered themselves with the SYSM, it will forward them the data they are registered to display, along with the current system state, and will react to value changes made in a UI according to the system’s state. In the course of system operation, workers can modify the amount of data displayed according to their needs, and the SYSM module will react accordingly by starting or ceasing to forward certain data to the UIs.

As the SYSM module is a large embedded system, it will boot off a hard disk and use the full features made available to a conventional workstation or server, including swapping. The server may be a Sun, a PowerPC, an ARM, or a conventional PC. It makes little difference which type of architecture is actually used for the SYSM module, since most of its functionality is fairly high level. Because it needs to serve many different applications in parallel while responding rapidly to incoming traffic, the SYSM module runs a kernel configured for preemption. Also, as it serves as a central point for management, it is a multiuser system with an extensive user toolset. The root filesystem on the SYSM module will look similar to the one found on common workstations and servers. In fact, we may even install a conventional server distribution on the SYSM module and configure it to our needs.

User interface modules

The user interface modules enable workers to interact with the ongoing process by viewing values that reflect the current status and modifying variables that control the process. The user interfaces are typically small embedded systems with mild time constraints. They too are network enabled, but in various ways. In contrast to the system components covered earlier, user interface modules can have various incarnations. Some can be fixed and attached close to a sensitive post of process control. Others can be portable and may be used by workers to walk around the processing plant and enter or retrieve various data. After all, some aspects of the controlled process may not be automated and may need to be entered by hand into the system.

The values displayed by the various UIs are retrieved from the SYSM module by communication with the appropriate custom daemons running on it. As UIs may receive critical events to display immediately, custom daemons run on the UI devices awaiting critical events sent from the SYSM module. The user can choose which variables she wants to view, or the data set may be fixed in advance, all depending on the purpose and the type of worker using the UI. In any case, some messages, such as critical events, will be displayed regardless of the configuration. Some UIs may display only limited data sets, while others may display detailed information regarding the ongoing process. On some UI modules, it is possible to engage in emergency procedures to handle a critical situation.

As UI modules are small, they typically boot from native flash or through the network. In the latter case, the SYSM module has to be configured to accommodate remote boot. Whether remote boot is used or not, the UI modules all obtain their IP addresses via DHCP. Portable UI modules are typically based on ARM, MIPS, or m68k architectures and run standard kernels. As the UI modules are used to interact with the user in an automated fashion, only minimal user tools are necessary, although extensive graphical utilities and libraries are required to accommodate the display and the interaction with the user. Since we assume that anyone on the premises has access to the UI modules, we do not implement any form of authentication on the UI devices, and hence none of the UI modules is a multiuser system. This, though, could change depending on system requirements.

Variations in Requirements

The description of the various modules given above is only a basic scheme by which to implement the system. Many variations can be made to the different components and the architecture of the system. Here is a list of such variations in no particular order:

  • Depending on the physical conditions where the system is deployed, it may be necessary to constantly verify the connectivity of the various system components. This would be achieved by a keepalive signal transmitted from the modules to the SYSM module or by using watchdogs, as sketched following this list.

  • Using TCP/IP over Ethernet on link A may pose some problems if reactions to some critical conditions need to be carried out in a deterministic time frame. If a certain chemical reaction being observed by the DAQ module shows signs of imminent disaster, the SYSM module may need to be notified before the chemical reaction goes out of control. In those cases, it may be a good idea to use RTNet, which provides hard real-time UDP over Ethernet.[10] This would necessitate running a real-time kernel on the SYSM module.

  • Ethernet is not fit for all environments. Some other protocols are known to be more reliable in industrial environments. If need be, the designers may wish to replace Ethernet with one of the known industrial networking interfaces, such as RS485, DeviceNet, ARCnet, Modbus, Profibus, or Interbus.

  • For compactness and speed, designers may wish to implement the DAQ, control, and SYSM modules in a single physical device, such as a CompactPCI chassis with a separate card for each module.

  • For management purposes, it may be simpler to implement the UI modules as X terminals. In this configuration, the UI modules would act only as display and input terminals. All computational load would be carried out on the SYSM module, which would be the X application host.

  • If the system is not very large and the process being controlled is relatively small, it may make sense to combine the DAQ, control, and SYSM modules into a single sufficiently powerful computer.

  • If one network link isn’t sufficient for the traffic generated by the DAQ module, it may make sense to add another link that would be dedicated to data transfers only.

  • Since it is more and more frequent to keep process data for quality assurance purposes, the SYSM module may run a database. This database would store information regarding the various operations of the system along with data recorded by the DAQ module.

Other variations are also possible, depending on the system’s requirements.
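
As an example of the watchdog option mentioned in the list above, the standard Linux watchdog interface is a device file that must be written to at regular intervals; if the writes stop, the watchdog driver reboots the system. The following sketch assumes a watchdog driver is already loaded and exposes /dev/watchdog, and that the chosen interval is shorter than the driver’s timeout:

    /* keepalive.c: pet the watchdog periodically (sketch) */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/watchdog", O_WRONLY);

        if (fd < 0)
            return 1;   /* no watchdog driver available */

        for (;;) {
            write(fd, "\0", 1);   /* any write resets the watchdog timer */
            sleep(10);            /* must be below the watchdog timeout */
        }
    }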

Design and Implementation Methodology

Designing and implementing an embedded Linux system can be carried out in a defined manner. The process includes many tasks, some of which may be carried out in parallel, hence reducing overall development time. Some tasks can even be omitted, if a distribution is being used. Regardless of the actual tools or methodology you use, Chapter 2 is required reading for all tasks involved in building an embedded Linux system.

While designing and implementing your embedded Linux system, use the worksheet provided in Appendix A to record your system’s characteristics. It includes a section to fully describe each aspect of your embedded system. This worksheet will help your team keep track of the system’s components and will help future maintainers understand how the system was originally built. In fact, a properly completed worksheet should be sufficient for people outside your team to rebuild the entire system without any assistance.

Given that the details of the tasks involved in building embedded Linux systems sometimes change with the updating of the software packages involved, visit this book’s web site (http://www.embeddedtux.org/) from time to time for updates.

Creating a Target Linux System

A target Linux system is created by configuring and bundling together the appropriate system components. Programming and development aspects are a separate subject, and are discussed later in this chapter.

There are four main steps to creating a target Linux system:

  • Determine system components

  • Configure and build the kernel

  • Build the root filesystem

  • Set up boot software and configuration

Determining system components is like making a shopping list before you go to the grocery store. It is easy to go without a shopping list and wonder at all the choices you have, as many do with Linux. This may result in “featurism,” whereby your system will have lots and lots of features but won’t necessarily fulfill its primary purpose. Hence, before you go looking at all the latest Linux gizmos available, sit down and write a list of what you need. I find this approach helps in focusing development and avoids distractions such as: “Look honey, they actually have salami ice cream.” This doesn’t mean that you shouldn’t change your list if you see something pertinent. It is just a warning about the quantity of software available for Linux and the inherent abundance of choices.

Chapter 3 discusses the hardware components that can be found as part of an embedded Linux system. This should provide you with enough background and maybe even ideas of what hardware you can find in an embedded Linux system. As Linux and surrounding software are ever-evolving targets, use this and further research on the Net to find out which design requirements are met by Linux. In turn, this will provide you with a list of items you need to develop to complete your system. This step of development is the only one that cannot be carried out in parallel with other tasks. Determining system requirements and Linux’s compliance to these requirements has to be completed before any other step.

Because of the ever-evolving nature of Linux, you may feel the need to get the latest and greatest pieces of software for your design. Avoid doing this, as new software often needs testing and may require other software to be upgraded because of the dependencies involved between packages. Hence, you may find yourself locked in a frantic race to keep up with the plethora of updates. Instead, fix the bugs with the current software you have and keep track of other advances so that the next generation projects you design can profit from these advances. If you have an important reason to upgrade a software component, carefully analyze the consequences of such an upgrade on the rest of your system before actually carrying out the upgrade. You may also want to test the upgrade on a test system before applying it to your main system.

Having determined which features are pertinent to your design, you can select a kernel version and relevant configuration. Chapter 5 covers the configuration and build process of the kernel. Unlike other pieces of software, you may want to keep updating your kernel to the latest stable version throughout your project’s development up until the beta stage. Though keeping the kernel version stable throughout the development cycle may seem simple, you may find yourself trying to fix bugs that have already been fixed in more recent kernels. Keeping yourself up to date with recent kernel developments, as we discuss in Chapter 5, will help you decide whether updating to the most recent kernel is best for you. Also, you may want to try newer kernels and roll back to older ones if you encounter any serious problems. Note that using kernels that are too old may cut you off from community support, since contributors can rarely afford to keep answering questions about old bugs.

Regardless of whether you decide to follow kernel updates, I suggest you keep the kernel configuration constant throughout the project. This will keep completed parts from breaking in the course of development. This involves studying the configuration options closely, though, in light of system requirements. Although this task can be conducted in parallel with other tasks, it is important that developers involved in the project be aware of the possible configuration options and agree with the options chosen.

Once configuration is determined, it is time to build the kernel. Building the kernel involves many steps and generates more than just a kernel image. Although the generated components are not necessary for some of the other development aspects of the project, the other project components tend to become more and more dependent on the availability of the kernel components as the project advances. It is therefore preferable to have the kernel components fully configured and built as early as possible, and kept up to date throughout the project.

In parallel to handling the kernel issues, you can start building the root filesystem of the embedded system, as explained in Chapter 6. The root filesystem of an embedded Linux system is similar to the one you find on a workstation or server running Linux, except that it contains only the minimal set of applications, libraries, and related files needed to run the system. Note that you should not have to remove any of the components you previously chose at this stage to obtain a properly sized root filesystem. In fact, if you have to do so, you probably did not determine system components adequately. Remember that this earlier stage should include an analysis of all system requirements, including the root filesystem size. You should therefore have as accurate as possible an estimate of the size of each component you selected during the first step of creating the target system.

If you are unable to predetermine the complete list of components you will need in your embedded system and would rather build your target root filesystem iteratively by adding the tools and libraries you need as you go along, then do so, but do not treat the result as your final root filesystem. Instead, use the iterative method to explore the building of root filesystems and then apply your experience to building a clean root filesystem for your target system. The reason behind this is that the trial-and-error nature of the iterative method makes its completion time nondeterministic. The structured approach may require more forethought, but its results are known and can be the basis for additional planning.

Setting up and configuring the storage devices and the bootloader software are the remaining tasks in creating a target Linux system. Chapter 7, Chapter 8, and Chapter 9 discuss these issues in full. It is during these steps that the different components of the target system come together: the bootloader, the root filesystem, and the kernel. As booting is highly dependent on the architecture, different bootloaders are involved. Within a single architecture there are also variations in the degree of debugging and monitoring provided by the bootloaders. The methodology to package and boot a system is fairly similar among the different architectures, but varies according to the permanent storage device from which the system is booted and the bootloader used. Booting a system from native flash, for instance, is different from booting a system from a DiskOnChip or CompactFlash device, and is even more different from booting from a network server.

Setting Up and Using Development Tools

Software development for embedded systems is different from software development for the workstation or server environments. Mainly, the target environment is often dissimilar to the host on which the development is conducted. Hence the need for a host/target setup whereby the developer develops his software on the host and downloads it onto the target for testing. There are two aspects to this setup: development and debugging. Such a setup, however, does not preclude you from using Linux’s multiarchitecture advantage to test your target’s applications on your host with little or no modification. Though not all applications can be tested in this way, testing target applications on the host will generally save you a lot of time.

Embedded development is discussed in Chapter 4. Prior to testing any code on the target system, it is necessary to establish a host/target connection. This will be the umbilical cord by which the developer will be able to interact with the target system to verify whether the applications he develops function as prescribed. As the applications cannot typically run on bare hardware, there will have to be a functional embedded Linux system on the target hardware already. Since it is often impossible to wait for the final target setup to be completed to test target applications, you can use a development target setup. The latter will be packaged much more loosely and will not have to respect the size requirements imposed on the final package. Hence, the development root filesystem may include many more applications and libraries than will be found in the final root filesystem. This also allows different and larger types of permanent storage devices during development.

Obtaining such a setup necessitates compiling the target applications and libraries. This is achieved by configuring and building the compiler and binary utilities for cross-development. Using these utilities, you can build applications for the target and thereby build the development target setup used for further development. With this done, you can use various Integrated Development Environments (IDEs) to ease development of the project components, and other tools, such as CVS, to coordinate work among developers.

Given the horsepower found on some embedded systems, some developers even choose to carry out all development directly on the target system. In this setup, the compiler and related tools all run on the target. This, in effect, combines host and target in a single machine and resembles conventional workstation application development. The main advantage of such a configuration is that you avoid the hassle of setting up a host/target environment.

Whatever development setup you choose, you will need to debug and poke at your software in many ways. You can do this with the debugging tools covered in Chapter 11. For simple debugging operations, you may choose ad hoc methods such as printing values using printf(). Some problems require more insight into the runtime operations of the software being debugged; this may be provided by symbolic debugging. gdb is the most common general-purpose debugger for Linux, but symbolic debugging on embedded systems may be more elaborate. It could involve such things as remote serial debugging, kernel debugging, and BDM and JTAG debugging tools. But even symbolic debugging may be inadequate in some situations. When system calls made by an application are problematic or when synchronization problems need to be solved, it is better to use tracing tools such as strace and LTT. For performance problems, there are other tools better suited to the task, such as gprof and gcov. When all else fails, you may even need to understand kernel crashes.
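
In its simplest form, the printf() approach amounts to sprinkling the code with messages that can be compiled out of the production image. The following minimal sketch assumes a hypothetical DPRINT macro and a DEBUG compile-time flag; both names are arbitrary choices, not a convention mandated by any of the tools above.

    /* dprint.c -- a minimal sketch of printf()-style debugging. Building
     * with -DDEBUG enables the messages; without the flag, the macro
     * expands to nothing and adds no code to the production image. */
    #include <stdio.h>

    #ifdef DEBUG
    #define DPRINT(fmt, ...) \
        fprintf(stderr, "%s:%d: " fmt "\n", __FILE__, __LINE__, ##__VA_ARGS__)
    #else
    #define DPRINT(fmt, ...) do { } while (0)
    #endif

    int main(void)
    {
        int level = 42;

        DPRINT("starting up, level=%d", level);
        /* ... normal application work would go here ... */
        DPRINT("shutting down, level=%d", level);
        return 0;
    }

Compiling with the DEBUG flag defined, as in gcc -DDEBUG dprint.c, turns the messages on; the production build simply omits the flag, so the space-constrained target image carries no debugging overhead.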

Developing for the Embedded

One of the main advantages of using Linux as an embedded OS is that code developed for Linux should run identically on an embedded target and on a workstation, right? Well, not quite. Although it is true that you can expect your Linux workstation code to build and run the same on an embedded Linux system, embedded system operations and requirements differ greatly from workstation or server environments. Whereas on a workstation you can let an error kill an application and leave it to the user to restart it, you can’t afford this sort of behavior in an embedded system. Neither can you allow applications to gobble up resources without end or behave in an untimely manner.[11] Therefore, even though the APIs and OS used may be identical, there are fundamental differences in programming philosophies.
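
The following sketch illustrates the difference in philosophy, assuming a hypothetical data-acquisition loop that reads from a serial-attached sensor; the device node and timing values are placeholders. Rather than exiting when open() fails, as a workstation utility comfortably could, the embedded version logs the error, backs off, and retries, since no user is present to restart it.

    /* read_loop.c -- a hypothetical sketch of the defensive style embedded
     * code calls for: transient failures are logged and retried rather than
     * allowed to terminate the application. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define SENSOR_DEV "/dev/ttyS1"   /* assumed device node, for illustration */

    int main(void)
    {
        char buf[64];

        for (;;) {
            int fd = open(SENSOR_DEV, O_RDONLY);

            if (fd < 0) {
                /* A workstation tool might simply exit here; an embedded
                 * application reports the error, waits, and tries again. */
                fprintf(stderr, "open %s failed: %s\n", SENSOR_DEV,
                        strerror(errno));
                sleep(5);
                continue;
            }

            ssize_t n = read(fd, buf, sizeof(buf));
            if (n > 0)
                printf("read %zd bytes from sensor\n", n);
            close(fd);
            sleep(1);
        }
        return 0;
    }

The same structure applies to memory allocation, I/O errors, and any other resource the application depends on: the code treats failure as a normal event to be handled, not a fatal anomaly.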

Networking

Networking enables an embedded system to interact with and be accessible to the outside world. In an embedded Linux environment, you have to choose networking hardware, networking protocols, and the services to offer, while accounting for network security. Chapter 10 covers the setup and use of networking services such as HTTP, Telnet, SSH, and/or SNMP. One interesting aspect of a network-enabled embedded system is the possibility of remote updating, whereby the system can be updated via a network link without on-site intervention. This is covered in Chapter 8.
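
As a minimal sketch of what offering such a service involves, the hypothetical daemon below listens on an arbitrary TCP port and returns a one-line status string to each client; the port number and message are placeholders, and a real device would also have to account for network security, as noted above.

    /* greetd.c -- hypothetical sketch of a tiny TCP service an embedded
     * device might offer; it returns a one-line status string per client. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define PORT 7777   /* arbitrary port, chosen for illustration only */

    int main(void)
    {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(PORT);

        if (srv < 0 || bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(srv, 5) < 0) {
            perror("greetd setup");
            return 1;
        }

        for (;;) {
            int cli = accept(srv, NULL, NULL);
            const char msg[] = "device alive\n";

            if (cli < 0)
                continue;
            if (write(cli, msg, sizeof(msg) - 1) < 0)
                perror("greetd write");
            close(cli);
        }
        return 0;
    }

In practice, most embedded systems rely on proven servers such as those covered in Chapter 10 rather than custom daemons, but the sketch shows how little code a basic TCP service requires.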



[1] It would be tempting to call these “host distributions,” but as you’ll see later, some developers choose to develop directly on their target, hence the preference for “development distributions.”

[2] Though this project used M-Systems’ binary drivers, there are GPL’d drivers for the DOC, as we’ll see in Chapter 7.

[3] WindRiver has since changed its mind and its relationship with BSD seems to be a matter of the past.

[4] “Free” as in “free speech,” not “free beer.” As Richard Stallman notes, the confusion is due to the English language, which makes no distinction between what other languages, such as French, differentiate as “libre” and “gratuit.” In effect, “free software” is translated to “logiciel libre” in French.

[5] The date was selected purposely in symbolic commemoration of the infamous Halloween Documents uncovered by Eric Raymond. If you are not familiar with these documents and their meaning, have a look at http://www.opensource.org/halloween/.

[6] The licenses are often stored in a file called COPYING, for the GPL, and a file called COPYING.LIB, for the LGPL. Copies of these files are likely to have been installed somewhere on your disk by your distribution.

[7] The specific wording of the GPL to designate this code is the following: “The source code for a work means the preferred form of the work for making modifications to it.” Delivering binaries of an obfuscated version of the original source code to try circumventing the GPL is a trick that has been tried before, and it doesn’t work.

[8] An open source distribution is a distribution that is maintained by the open source community, such as Debian. Inherently, such distributions do not contain any proprietary software.

[9] Though they are not used in this example, off-the-shelf Ethernet-enabled DAQ devices are readily available.

[10] Though UDP does not delay packet transfers as TCP does, the standard TCP/IP stack in Linux is not hard real time. RTNet provides hard real-time network communication by providing a UDP stack directly on top of RTAI or RTLinux.

[11] Normal Linux workstation and server applications should not gobble up resources either. In fact, the most important applications used on Linux servers are noteworthy for their stability, which is one reason Linux is so successful as a server operating system.
