Chapter 4. Functional Concurrency

Now that we have discussed functional data structures and algorithms, let’s return to the topic that has sparked widespread interest in functional programming in the first place: concurrency. Recall this warning from Chapter 1:

Warning

Multithreaded programming, requiring synchronized access to shared, mutable state, is the assembly language of concurrency.

We’ve already discussed how immutable values make synchronization unnecessary. Yet, mutating state is never completely avoidable. Let’s examine two higher-level abstractions that provide “principled” ways to manage mutable state in thread-safe ways: Actors and Software Transactional Memory.

The Actor Model

The Actor model isn’t really a functional approach to concurrency, but it fits our general goal of managing state mutation in principled ways. In the Actor model of concurrency, work is coordinated by message passing between “actors.” Each actor has a queue, sometimes called a mailbox, for incoming messages. The actor processes each message, one at a time. Carl Hewitt and collaborators developed the actor model almost 40 years ago [Hewitt1973]. [Agha1987] provides a complete description of the theory of actors. Perhaps the best known implementation of actors is found in Erlang, where actors are the core of everything you do in the language.

It’s interesting to note that Alan Kay’s original vision for objects in Smalltalk is much closer to the actor model than it is to the objects found in most languages [Kay1998]. For Kay, “The big idea is messaging.” He also believed that state changes should be encapsulated and not done in an unconstrained way.

This metaphor of passing messages between objects is not only intuitive, it helps clarify boundaries between objects. Have you seen code where one object makes lots of little calls to other objects to get bits of information? How would you change that code if you thought in terms of message passing, instead?

In an actor system, state mutation is handled one of several ways. For some state, it can be the responsibility of one actor to mutate that state. No other code is permitted to do so. When a mutation is required, a message is sent to the actor, which performs all such changes sequentially, thereby avoiding synchronization problems.

A similar model is to allow multiple actors to modify the same state, but only one at a time. A special “semaphore” message is exchanged that tells the receiver that it is safe to modify the state. When finished, the semaphore is sent to another actor.

Both cases run the risk of creating a bottleneck if the scope of responsibility is too large. It might be necessary to break it down into smaller, “isolated” sections.

Fortunately, good actor libraries are available for most languages. Perhaps the best option for Java is the Akka Java API [Akka]. An alternative is also available in [Functional Java].

Here is a simple actor-based program that remembers every string passed to it, keeping the string and the time it was seen in a map:

package actors;
import akka.actor.*;
import static akka.actor.Actors.*;
import java.util.*;

public class AkkaActorExample {
  // server code
  static class MemoryActor extends UntypedActor {
    final Map<String,Date> seen = new HashMap<String,Date>();

    public void onReceive(Object messageObject) {
      String message = messageObject.toString(); // simplifying assumption
      if (message.equals("DUMP")) {
        getContext().replySafe(seen.toString());
      } else {
        Date date = new Date();
        seen.put(message.toString(), date);
        getContext().replySafe("'" + message + "' recorded at " + date);
      }
    }
  }

  public static void main(String[] args) {
    ActorRef remActor = actorOf(MemoryActor.class).start();
    for (String arg: args) {
      // client code
      Object response = remActor.sendRequestReply(arg);
      System.out.println("Reply received: "+response);
    }
    Object response = remActor.sendRequestReply("DUMP");
    System.out.println("Dump of remembered strings: "+response);
    System.exit(0);
  }
}

For convenience, everything is wrapped in a class, AkkaActorExample, which also defines the main method. The MemoryActor extends Akka’s UntypedActor, so named because the messages are of type Object.

MemoryActor implements an onReceive method, declared abstract by UntypedActor, which is called whenever a new message is received by the actor. This handler stores the input message (basically assuming it is a string, for simplicity) and the current time in a mutable map. It replies to the caller that the message was recorded.

However, if a special message DUMP is received, the actor replies with a “dump” of the current state of the map. Note that the actor manages the mutable state and prevents any other code from accessing it. Even the DUMP message returns a string, rather than the map itself.

The main method uses the Akka idiom for instantiating an actor of instance MemoryActor and wrapping it in an ActorRef, which is returned to main. Akka separates the actor instance from references to it, an example of the Bridge design pattern [GOF1995]. Akka does this so that if an actor instance fails for some reason, it can be restarted without requiring clients to acquire a new reference to the new actor. This is an example of the extensive robustness and error recovery features in Akka’s Actor library, which were inspired by similar capabilities in Erlang.

Once main has an actor reference, it loops through the input arguments and sends each word, one at a time, to the actor. It then prints the response received. At the end, it sends the DUMP message.

To keep the example simple, synchronous calls and responses are used, where the code waits for a reply after each message is sent. Normally, you would use asynchronous messages for better scalability, which Akka supports.

If you download the code examples and build the actor.example make target, it runs this code with the arguments I am a Master Thespian!. Here is the output (omitting some Akka informational messages):

Reply received: 'I' recorded at Sat Jun 25 16:14:43 CDT 2011
Reply received: 'am' recorded at Sat Jun 25 16:14:43 CDT 2011
Reply received: 'a' recorded at Sat Jun 25 16:14:43 CDT 2011
Reply received: 'Master' recorded at Sat Jun 25 16:14:43 CDT 2011
Reply received: 'Thespian!' recorded at Sat Jun 25 16:14:43 CDT 2011
Dump of remembered strings: {
  am=Sat Jun 25 16:14:43 CDT 2011,
  a=Sat Jun 25 16:14:43 CDT 2011,
  Master=Sat Jun 25 16:14:43 CDT 2011,
  Thespian!=Sat Jun 25 16:14:43 CDT 2011,
  I=Sat Jun 25 16:14:43 CDT 2011}

I wrapped the long line for the “Dump” output. Note that creating the string for the map required iterating through it, which doesn’t preserve insertion order, as you would expect.

This example just scratches the surface of what you can do with Akka Actors (as well as other Actor libraries), including distributing actors remotely, managing their life cycles, handling crash recovery, etc. See [Akka] for more details.

Software Transactional Memory

Chances are you’ve worked on an application with a database backend. A key feature of most relational databases is support for ACID transactions, an acronym for atomicity, consistency, isolation, and durability.[7] The goal of ACID transactions is to avoid logical inconsistencies in a given set of related records, for example where two simultaneous updates leave the set of records in an inconsistent state, or updates are made that are based on stale data, which could effectively erase more recent updates.

Software Transactional Memory (STM) brings transactions to locations in memory that are referenced by variables [STM] (see also [PeytonJones2007]). STM can’t provide durability, because memory isn’t durable (e.g., if the power is lost), but STM can provide the ACI, atomicity, consistency, and isolation in ACID.

The model in STM is to separate references to values from the values themselves. We saw this principle at work in Akka actors. In STM, a program has a reference to a value of interest. The STM framework provides a protocol for changing the value to which the reference “points.”

However, values themselves are not changed. They remain immutable. Only the references change to point to new values. We saw in Persistent Data Structures how the appropriate choice of implementation can provide an efficient way to make a new value from a large object without copying the parts of it that aren’t changing. Rather, those parts are shared between the old and new version of the object. Persistent Data Structures are exactly what STM needs.

Figure 4-1 shows the state at time “0.” There are two references pointing the same value1 of a persistent data structure, adapted from Figure 3-1 in the previous chapter.

Time 0, one value with two references to it

Figure 4-1. Time 0, one value with two references to it

Now let’s change ref2 to point to a new, updated value, as shown in Figure 4-2.

Time 1, two values, with one reference to each

Figure 4-2. Time 1, two values, with one reference to each

By time “1,” an STM transaction in the context of ref2 has been used to move its reference to value2, which was created from value1, as indicated by the dotted line. Creating value2 does not necessary have to occur within the transaction, just the reassignment of ref2 (but see the example below). Note that ref1 still points to the old value, value1.

This behavior allows different clients to acquire references to the same value at a particular time, but each can work with the value without fear that it will change unexpectedly, due to the actions of one of the other clients. Recall that a history of the evolving values is effectively maintained, as long as there are references pointing to multiple versions. A version with no references will be garbage collected.

So that’s how STM works behind the scenes. What’s it like for a client to use STM?

There are several STM libraries for Java, many of which are inspired by Clojure’s implementation. Akka integrates with the [Multiverse STM]. Below is a simple example adapted from the Akka documentation [Akka]. A reference to an Integer value is managed using the techniques described above:

// Adapted from Akka example source code.
// Copyright (C) 2009-2011 Scalable Solutions AB <http://scalablesolutions.se>
package stm;
import akka.stm.*;

public class AkkaSTMIntegerCounter {

  private final Ref<Integer> ref = new Ref<Integer>(0);

  public int counter() {
    return new Atomic<Integer>() {
      public Integer atomically() {
        int inc = ref.get() + 1;
        ref.set(inc);
        return inc;
      }
    }.execute();
  }

  public static void main(String[] args) {
    AkkaSTMIntegerCounter counterRef = new AkkaSTMIntegerCounter();
    System.out.println(counterRef.counter());   // -> 1
    System.out.println(counterRef.counter());   // -> 2
  }
}

First, a typed reference, Ref<Integer>, is created with the initial value of zero. Then, a helper method counter handles incrementing the value and returning the new value. The mutation and update of the reference must be enclosed in an Atomic<Integer> object (analogous to synchronizing a method). The Ref.get method retrieves the current value and the Ref.set method sets a new value. Note that wrapping these steps in Atomic prevents updates using potentially stale values from calls to get.

The main method instantiates an AkkaSTMIntegerCounter object, then calls counter twice and prints the results. The numbers 1 and 2 will be printed on separate lines.

For a beautiful exposition on STM, see [PeytonJones2007].

Exercises

  1. Using the [Akka] documentation for actors, modify the Actor example to make calls asynchronously. For example, create several actors that send messages to MemoryActor and add an actor that main uses to receive the replies.

  2. Use the Akka/Multiverse API to manage a more complex object, like a collection.



[7] One of the big data trends is to use new kinds of databases that relax this constraint in order to improve throughput and availability.

Get Functional Programming for Java Developers now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.