Preface

It’s been quite a while since the people from whom we get our project assignments accepted the excuse “Gimme a break! I can only do one thing at a time!” It used to be such a good excuse, too, when things moved just a bit slower and a good day was measured in written lines of code. In fact, today we often do many things at a time. We finish off breakfast on the way into work; we scan the Internet for sports scores and stock prices while our application is building; we’d even read the morning paper in the shower if the right technology were in place!

Being busy with multiple things is nothing new, though. (We’ll just give it a new computer-age name, like multitasking, because computers are happiest when we avoid describing them in anthropomorphic terms.) It’s the way of the natural world—we wouldn’t be able to write this book if all the body parts needed to keep our fingers moving and our brains engaged didn’t work together at the same time. It’s the way of the mechanical world—we wouldn’t have been able to get to this lovely prefabricated office building to do our work if the various, clanking parts of our automobiles didn’t work together (most of the time). It’s the way of the social and business world—three authoring tasks went into the making of this book, and the number of tasks, all happening at once, grew exponentially as it went into its review cycles and entered production.

Computer hardware and operating systems have been capable of multitasking for years. CPUs using a RISC (reduced instruction set computing) microprocessor break down the processing of individual machine instructions into a number of separate tasks. By pipelining each instruction through each task, a RISC machine can have many instructions in progress at the same time. The end result is the heralded speed and throughput of RISC processors. Time-sharing operating systems have been allowing users nearly simultaneous access to the processor for longer than we can remember. Their ability to schedule different tasks (typically called processes) really pays off when separate tasks can actually execute simultaneously on separate CPUs in a multiprocessor system.

Although real user applications can be adapted to take advantage of a computer’s ability to do more than one thing at once, a lot of operating system code must execute to make it possible. With the advent of threads we’ve reached an ideal state—the ability to perform multiple tasks simultaneously with as little operating system overhead as possible.

Although threaded programming styles have been around for some time now, it’s only recently that they’ve been adopted by the mainstream of UNIX programmers (not to mention those erstwhile laborers in the vineyards of Windows NT and other operating systems). Software sages swear at the lunchroom table that transaction processing monitors and real-time embedded systems have been using thread-like abstractions for more than twenty years. In the mid-to-late eighties, the general operating system community embarked on several research efforts focused on threaded programming designs, as typified by the work of Tom Doeppner at Brown University and the Mach OS developers at Carnegie Mellon. With the dawn of the nineties, threads became established in the various UNIX operating systems, such as USL’s System V Release 4, Sun Solaris, and the Open Software Foundation’s OSF/1. The clash of platform-specific threads programming libraries underscored the need for a portable, platform-independent threads interface. The IEEE has just this year met this need with the acceptance of the IEEE Standard for Information Technology Portable Operating System Interface (POSIX) Part 1: System Application Programming Interface (API) Amendment 2: Threads Extension [C Language]—the Pthreads standard, for short.

This book is about Pthreads—a lightweight, easy-to-use, and portable mechanism for speeding up applications.

Organization

We’ll start off Chapter 1 by introducing you to multithreading as a way of performing the many tasks of a program with greater efficiency and speed than would be possible in a serial or multiprocess design. We’ll then examine the pitfalls of serial and multiprocess programming, and discuss the concept of potential parallelism, the cornerstone of any decision to write a multitasking program. We’ll introduce you to your first Pthreads call—pthread_create—and look at those structures by which a thread is uniquely identified. We’ll briefly examine the ways in which multiple threads in the same process exchange data, and we’ll highlight some synchronization issues.
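By way of preview, here is a minimal sketch of the kind of program Chapter 1 builds up to: one thread created with pthread_create and then waited for with pthread_join. The routine name say_hello and its message are our own stand-ins, not code from the book’s examples.

#include <pthread.h>
#include <stdio.h>

/* The routine the new thread will run; the name and message are our own stand-ins. */
void *say_hello(void *arg)
{
    printf("Hello from the new thread: %s\n", (char *)arg);
    return NULL;
}

int main(void)
{
    pthread_t tid;   /* the identifier by which we refer to the new thread */

    /* Create one thread running say_hello, then wait for it to finish. */
    if (pthread_create(&tid, NULL, say_hello, "greetings") != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    pthread_join(tid, NULL);
    return 0;
}

On most systems you would compile such a program with something like cc hello.c -lpthread, though the exact flags vary by platform.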

We’ll continue our discussion of planning and structuring a multithreaded program in Chapter 2. Here, we’ll look at the types of applications that can benefit most from multithreading. We’ll present the three classic methods for distributing work among threads—the boss/worker model, the peer model, and the pipeline model. We’ll also compare two strategies for creating threads—creation on demand versus thread pools. After a brief discussion of thread data-buffering techniques, we’ll introduce the ATM server application example that we’ll use as the proving ground for thread concepts we’ll examine throughout the rest of the book.

In Chapter 3, we’ll look at the tools that the Pthreads library provides to help you ensure that threads access shared data in an orderly manner. This chapter includes lengthy discussions of mutex variables and condition variables, the two primary Pthreads synchronization tools. It also describes reader/writer locks, a more complex synchronization tool built from mutexes and condition variables. By the end of the chapter, we will have added synchronization to our ATM server example and presented most of what you’ll need to know to write a working multithreaded program.
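As a taste of the synchronization tools Chapter 3 covers, here is a minimal sketch of a mutex protecting a shared counter. The names balance, balance_lock, and deposit are our own, standing in for the ATM server’s real bookkeeping; this is an illustration, not one of the book’s examples.

#include <pthread.h>
#include <stdio.h>

/* A shared counter and the mutex that protects it; both names are our own. */
static int             balance = 0;
static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread makes 1000 deposits, holding the lock around each update. */
void *deposit(void *arg)
{
    int i;

    for (i = 0; i < 1000; i++) {
        pthread_mutex_lock(&balance_lock);
        balance = balance + 1;
        pthread_mutex_unlock(&balance_lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t thread1, thread2;

    pthread_create(&thread1, NULL, deposit, NULL);
    pthread_create(&thread2, NULL, deposit, NULL);
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("final balance: %d\n", balance);   /* always 2000, thanks to the mutex */
    return 0;
}

Without the lock and unlock calls, the two threads could interleave their updates and lose deposits; with them, the final balance is always 2000.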

We’ll look at the special characteristics of threads and the more advanced features of the Pthreads library in Chapter 4. We’ll cover some large topics, such as keys (a very handy way for threads to maintain private copies of shared data) and cancellation (a practical method for allowing your threads to be terminated asynchronously without disturbing the state of your program’s data and locks). We’ll cover some smaller topics, such as thread attributes, including the one that governs the persistence of a thread’s internal state. (When you get to this chapter, we promise that you’ll know what this means, and you may even value it!) A running theme of this chapter is the set of tools that, taken together, lets you control thread scheduling policies and priorities. You’ll find these discussions especially important if your program includes one or more real-time threads.

In Chapter 5, we’ll describe how multithreaded programs interact with features of the UNIX operating system that many serial programs take for granted. First, we’ll examine the special challenges UNIX signals pose to multithreaded programs; we’ll look at the types of signals threads must worry about and how you can direct certain signals to specific threads. We’ll then focus on the requirements the Pthreads library imposes on system calls and libraries to allow them to work correctly when multiple threads from the same process are using them at the same time. Finally, we’ll show you what the UNIX fork and exec calls do to threads. (It isn’t always pretty.)

After we’ve dealt with the fundamentals of Pthreads programming in the earlier chapters, we turn in Chapter 6 to the practical issues you’ll face in deploying a multithreaded application. The theme of this chapter is speed. We’ll look at those performance concerns over which you have little control—those that are inherent in a given platform’s Pthreads implementation. Here, we’ll profile the three major ways implementors design a Pthreads-compliant platform, listing the advantages and drawbacks of each. We’ll move on to a discussion of debugging threads, where we’ll illustrate a number of debugging strategies using a thread-capable debugger. Finally, we’ll look at several alternatives for improving a program’s performance, running tests on different versions of our ATM server as contention and workload increase.

We’ve also included three brief appendixes:

  • Appendix A shows how a multithreaded program might be written using the Open Software Foundation’s Distributed Computing Environment (DCE).

  • Appendix B lists the differences between Draft 4 of the Pthreads standard and Draft 10, its final version.

  • Appendix C is meant to help you find the syntax of any Pthreads library call quickly, without the need for another book.

Example Programs

You can obtain the source code for the example programs presented in this book from O’Reilly & Associates through their Internet server, by anonymous FTP.

FTP

To use FTP, you need a machine with direct access to the Internet. A sample session is shown, with what you should type in boldface.

%  ftp ftp.oreilly.com
Connected to ftp.oreilly.com.
220 FTP server (Version 6.21 Tue Mar 10 22:09:55 EST 1992) ready.
Name (ftp.oreilly.com:yourname) : anonymous
331 Guest login ok, send domain style e-mail address as password.
Password: yourname@domain.name (use your user name and host here)
230 Guest login ok, access restrictions apply.
ftp> cd /work/nutshell/pthread
250 CWD command successful.
ftp> binary (Very important! You must specify binary transfer for
compressed files.)
200 Type set to I.
ftp> get examples.tar.gz
200 PORT command successful.
150 Opening BINARY mode data connection for examples.tar.gz.
226 Transfer complete.
ftp> quit
221 Goodbye.
%

The file is a gzip-compressed tar archive; extract the files from the archive by typing:

% gzcat examples.tar.gz | tar xvf -

System V systems require the following tar command instead:

% gzcat examples.tar.gz | tar xof -

If gzcat is not available on your system, use separate gunzip and tar commands:

% gunzip examples.tar.gz
% tar xvf examples.tar

Typographical Conventions

The following font conventions are used in this book:

  • Italic is used for function names, filenames, program names, commands, and variables. It’s also used to identify new terms and concepts when they are introduced.

  • Constant Width is used for code examples and for the system output portion of interactive examples.

  • Constant Bold is used in interactive examples to show commands or other text that would be typed literally by the user.

  • Constant Italic identifies programmer-supplied variables in the C language function bindings that appear in Appendix C.

Acknowledgments

First of all, we’d like to thank Andy Oram, our editor at O’Reilly & Associates. He stuck with us through the long haul, and the book benefits beyond measure from his attentive reviews, technical expertise, and sheer professionalism. We’re also indebted to our technical reviewers: Jeff Denham, Bill Gallmeister, and Dean Brock. Jeff, Greg Nichols, and Bernard Farrell read and commented on early drafts of the book. Thank you all!

Brad: “The inspiration for this book came from a threads programming seminar I developed back in 1991 for the Institute for Software Advancement (ISA). I’d like to express my appreciation to Rich Mitchell of ISA and Nick Uginow of DEC for setting me on this track, as well as the good folks at DECwest in Seattle and DEC software engineering in Nashua, New Hampshire, who attended my seminars and helped the course evolve. I’d like to acknowledge the support and encouragement of my former colleagues at DEC: Andy Kegal, Fred Glover, Ed Cande, and Steve Strange. On the personal side, I’d like to acknowledge my grandmother, Natalie Bunker, for the desire to write a book, my wife Susan for supporting me through the long project, and my friend Paul Silva for modeling the determination needed to complete it.”

Dick: “I’d like to thank Kathleen Johnson, Thomas Doeppner, Stan Amway, Cheryl Wiecek, Steve Fiorelli, and Dave Long. Each can lay a claim to some flavor and vintage of threads information I filed away somewhere in my head just in case someone asked. Special thanks to Ruth Goldenberg (the most technical and generous of writers), Mike Etzel, and Howard Littlefield. I want to especially thank Connie, my wife, for her love, patience, and permission to skip this year’s spring cleanup. (Another book for the snow-shovelling season, Brad and Jackie?) Finally, love to my kids: Jenn (who wants a giraffe on the cover), Maggie (a doggie), and Tom (a lobster ... on a pirate’s shoulder... with one leg....).”

Jackie: “I’d like to thank Bernard, who is not only a superb technical resource but an absolutely wonderful, supportive husband. I’d also like to thank Mark Sanders and Jonathan Swartz for my first introductions to threads concepts. Thanks also to the whole DECthreads team, and Peter Portante in particular, for helping refine my understanding of the practical matters of programming with Pthreads.”
