Cover by Francesco Cesarini, Simon Thompson

Safari, the world’s most comprehensive technology and business learning platform.

Find the exact information you need to solve a problem on the fly, or go deeper to master the technologies and skills you need to succeed

Start Free Trial

No credit card required

O'Reilly logo

Chapter 4. Concurrent Programming

Concurrency is the ability for different functions to execute in parallel without affecting each other unless explicitly programmed to do so. Each concurrent activity in Erlang is called a process. The only way for processes to interact with each other is through message passing, where data is sent from one process to another. The philosophy behind Erlang and its concurrency model is best described by Joe Armstrong’s tenets:

  • The world is concurrent.

  • Things in the world don’t share data.

  • Things communicate with messages.

  • Things fail.

The concurrency model and its error-handling mechanisms were built into Erlang from the start. With lightweight processes, it is not unusual to have hundreds of thousands, even millions, of processes running in parallel, often with a small memory footprint. The ability of the runtime system to scale concurrency to these levels directly affects the way programs are developed, differentiating Erlang from other concurrent programming languages.

What if you were to use Erlang to write an instant messaging (IM) server, supporting the transmission of messages between thousands of users in a system such as Google Talk or Facebook? The Erlang design philosophy is to spawn a new process for every event so that the program structure directly reflects the concurrency of multiple users exchanging messages. In an IM system, an event could be a presence update, a message being sent or received, or a login request. Each process will service the event it handles, and terminate when the request has been completed.

You could do the same in C or Java, but you would struggle when scaling the system to hundreds of thousands of concurrent events. An option might be to have a pool of processes handling specific event types or particular users, but certainly not a new process for every event. Erlang gets away with this because it does not use native threads to represent processes. It has its own scheduler in the virtual machine (VM), making the creation of processes very efficient while at the same time minimizing their memory footprint. This efficiency is maintained regardless of the number of concurrent processes in the system. The same argument applies for message passing, where the time to send a message is negligible and constant, regardless of the number of processes. This chapter introduces concurrent programming in Erlang, letting you in on one of the most powerful concurrency models available today.

Creating Processes

So far, we’ve looked at executing sequential code in a single process. To run concurrent code, you have to create more processes. You do this by spawning a process using the spawn(Module, Function, Arguments) BIF. This BIF creates a new process that evaluates the Function exported from the module Module with the list of Arguments as parameters. The spawn/3 BIF returns a process identifier, which from now on we will refer to as a pid.

In Figure 4-1, the process we call Pid1 executes the spawn BIF somewhere in its program. This call results in the new process with process identifier Pid2 being created. Process identifier Pid2 is returned as a result of the call to spawn, and will typically be bound to a variable in an expression of the following format:

Pid2 = spawn(Module, Function, Arguments).
Before and after calling spawn

Figure 4-1. Before and after calling spawn

The pid of the new process, Pid2, at this point is known only within the process Pid1, as it is a local variable that has not been shared with anybody. The spawned process starts executing the exported function passed as the second argument to the BIF, and the arity of this function is dictated by the length of the list passed as the third argument to the BIF.


A common error when you start programming Erlang is to forget that the third argument to spawn is a list of arguments, so if you want to spawn the function m:f/1 with the argument a, you need to call:

spawn(m, f, [a])


not spawn(m, f, a).

Once spawned, a process will continue executing and remain alive until it terminates. If there is no more code to execute, a process is said to terminate normally. On the other hand, if a runtime error such as a bad match or a case failure occurs, the process is said to terminate abnormally.

Spawning a process will never fail, even if you are spawning a nonexported or even a nonexistent function. As soon as the process is created and spawn/3 returns the pid, the newly created process will terminate with a runtime error:

1> spawn(no_module, nonexistent_function, []).

=ERROR REPORT==== 29-Feb-2008::21:48:29 ===
Error in process <0.32.0> with exit value: 

In the preceding example, note how the error report is formatted. It is different from the ones you saw previously, as the error does not take place in the shell, but in the process with pid <0.32.0>. If the error occurs in a spawned process, it is detected by another part of the Erlang runtime system called the error logger, which by default prints an error report in the shell using the format shown earlier. Errors detected by the shell are instead formatted in a more readable form.

The processes() BIF returns a list of all of the processes running in the system. In most cases, you should have no problems using the BIF, but there have been extreme situations in large systems where calling processes() from the shell has been known to result in the runtime system running out of memory![13] Don’t forget that in industrial applications, you might be dealing with millions of processes running concurrently. In the current implementation of the runtime system, the absolute limit is in the hundreds of millions. Check the Erlang documentation for the latest figures. The default number is much lower, but you can easily change it by starting the Erlang shell with the command erl +P MaxProcceses, where MaxProcesses is an integer.

You can use the shell command i() to find out what the currently executing processes in the runtime system are doing. It will print the process identifier, the function used to spawn the process, the function in which the process is currently executing, as well as other information covered later in this chapter. Look at the example in the following shell printout. Can you spot the process that is running as the error logger?

2> processes().
3> i().
Pid                   Initial Call                          Heap     Reds Msgs
Registered            Current Function                     Stack
<0.0.0>               otp_ring0:start/2                      987     2684    0
init                  init:loop/1                              2
<0.2.0>               erlang:apply/2                        2584    61740    0
erl_prim_loader       erl_prim_loader:loop/3                   5
<0.4.0>               gen_event:init_it/6                    610      219    0
error_logger          gen_event:fetch_msg/5                   11
<0.5.0>               erlang:apply/2                        1597      508    0

If you are wondering why the processes() BIF returned far more than 20 processes when you created only one that failed right after being spawned, you are not alone. Large parts of the Erlang runtime system are implemented in Erlang, the error_logger and the Erlang shell being two of the many examples. You will come across other system processes as you work your way through the remaining chapters of this book.

[13] Partially because the return values of the operations in the shell are cached.

Find the exact information you need to solve a problem on the fly, or go deeper to master the technologies and skills you need to succeed

Start Free Trial

No credit card required