O'Reilly logo

Erlang Programming by Francesco Cesarini, Simon Thompson

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Race Conditions, Deadlocks, and Process Starvation

Anyone who has programmed concurrent applications before moving to Erlang will have his favorite horror stories on memory corruption, deadlocks, race conditions, and process starvation. Some of these conditions arise as a result of shared memory and the need for semaphores to access them. Others are as a result of priorities. Having a “no shared data” approach, where the only way for processes to share data is by copying data from one process to another, removes the need for locks, and as a result, the vast majority of bugs related to memory corruption deadlocks and race conditions.

Problems in concurrent programs may also arise as a result of synchronous message passing, especially if the communication is across a network. Erlang solves this through asynchronous message passing. And finally, the scheduler, the per-process garbage collection mechanisms, and the massive level of concurrency that can be supported in Erlang systems ensure that all processes get a relatively fair time slice when executing. In most systems, you can expect a majority of the processes to be suspended in a receive statement, waiting for an event to trigger a chain of actions.

That being said, Erlang is not completely problem-free. You can avoid these problems, however, through careful and well-thought-out design. Let’s start with race conditions. If two processes are executing the following code in parallel, what can go wrong?

start() ->
   case whereis(db_server) of
     undefined ->
        Pid = spawn(db_server, init, []),
        register(db_server, Pid),
        {ok, Pid};
     Pid when is_pid(Pid) ->
        {error, already_started}
end.

Assume that the database server process has not been started and two processes simultaneously start executing the start/0 function. The first process calls whereis(db_server), which returns the atom undefined. This pattern-matches the first clause, and as a result, a new database server is spawned. Its process identifier is bound to the variable Pid. If, after spawning the database server, the process runs out of reductions and is preempted, this will allow the second process to start executing. The call whereis(db_server) by the second process also returns undefined, as the first process had not gotten as far as registering it. The second process spawns the database server and might go a little further than the first one, registering it with the name db_server. At this stage, the second process is preempted and the first process continues where it left off. It tries to register the database server it created with the name db_server but fails with a runtime error, as there already is a process with that name. What we would have expected is the tuple {error, already_started} to be returned, instead of a runtime error. Race conditions such as this one in Erlang are rare, but they do happen when you might least expect them. A variant of the preceding example was taken from one of the early Erlang libraries and reported as a bug in 1996.

A second potential problem to keep in mind involves deadlocks. A good design of a system based on client/server principles is often enough to guarantee that your application will be deadlock-free. The only rule you have to follow is that if process A sends a message and waits for a response from process B, in effect doing a synchronous call, process B is not allowed, anywhere in its code, to do a synchronous call to process A, as the messages might cross and cause the deadlock. Deadlocks are extremely rare in Erlang as a direct result of the way in which programs are structured. In those rare occasions where they slip through the design phase, they are caught at a very early stage of testing.[15]

By calling the BIF process_flag(priority, Priority), where Priority can be set to the atom high, normal, or low, the behavior of the scheduler can be changed, giving processes with higher priority a precedence when being dispatched. Not only should you use this feature sparingly; in fact, you should not use it at all! As large parts of the Erlang runtime system are written in Erlang running at a normal priority, you will end up with deadlocks, starvation, and in extreme cases, a scheduler that gives low-priority processes more CPU time than its high-priority counterparts. With SMP, this behavior becomes even more non-deterministic. Endless flame wars and arguments regarding process and priorities have been fought on the Erlang-questions mailing list, deserving a whole chapter on the subject. We will limit ourselves to saying that under no circumstances should you use process priorities. A proper design of your concurrency model will ensure that your system is well balanced and deterministic, with no process starvation, deadlocks, or race conditions. You have been warned!



[15] In 15 years of working with Erlang on systems with millions of lines of code, Francesco has come across only one deadlock that made it as far as the integration phase.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required