When on consulting assignments around the world working with developers coming from a C++ and Java background who have learned Erlang on their own, a common theme we have come across is the use of processes. This theme is irrespective of the experience level of the developers, and of what their system does. Instead of creating a process for every truly concurrent activity in the system, they tend to create one for every task. Programming concurrent applications in Erlang requires a different strategy for processes, which in turn means reasoning in a different way to what one may be accustomed to. The main difference from other concurrent languages is that with Erlang, processes are so cheap it is best to use a process for each truly concurrent activity in your system, not for every task. This case study is from one of Francesco’s first consulting assignments outside of Ericsson, soon after Erlang had been released as open source, and it illustrates clearly what we mean by the difference between a task and an activity.
He worked with a group of engineers who were developing an IM proxy for Jabber. The system received a packet through a socket, decoded it, and took certain actions based on its content. Once the actions were completed, it encoded a reply and sent it to a different socket to be forwarded to the recipient. Only one packet at a time could come through a socket, but many sockets could simultaneously be receiving and handling packets.
As described in Figure 4-13, the original system did not have a process for every truly concurrent activity—the processing of a packet from end to end—but instead used a process for every different task—decoding, encoding, and so forth. Each open socket in Erlang was associated with a process that was receiving and sending data through this socket. Once the packet was received, it was forwarded to a process that handled the decoding. Once decoded, the decoding process forwarded it to the handler that processed it. The result was sent to an encoding process, which after formatting it, forwarded the reply to the socket process that held the open connection belonging to the recipient.
At its best performance, the system could process five simultaneous messages, with the decoding, handling, and encoding being the bottleneck, regardless of the number of simultaneously connected sockets. There were two other processes, one used for error handling, where all errors were passed, and one managing a database, where data reads, writes, and deletes were executed.
When reviewing the system, we identified what we believed was a truly concurrent activity in the system. It was not the action of decoding, handling, and encoding that was the answer, but the handling of the individual packets themselves. Having a process for every packet and using that process to decode, handle, and encode the packet meant that if thousands of packets were received simultaneously, they would all be processed in parallel. Knowing that a socket can receive only one packet at any one time meant that we could use this socket process to handle the call. Once the packet was received, a function call ensured that it was decoded and handled. The result (possibly an error) was encoded and sent to the socket process managing the connection belonging to the final recipient. The error handler and database processes were not needed, as the consistency of data through the serialization of destructive database operations could have been achieved in the handler process, as could the management of errors.
If you look at Figure 4-14, you will notice that on top of the socket processes, a database process was added to the rewritten program. This was to ensure that data consistency was maintained, as many processes accessing the same data might corrupt it as a result of a race condition. All destructive database operations such as write and delete were serialized through this process. Even if you can execute most of your activities in parallel, it is essential to identify activities that need serializing and place them in a process of their own. By taking care in identifying truly concurrent activites in your Erlang system, and spawning a process for each, you will ensure that you maximize the throughput while reducing the risk of bottlenecks.