What is Erlang-Style Concurrency?

Every now and then, a blog article appears on how to do “Erlang-Style Concurrency” in language X. I think this is a very good thing, as it indicates that people are starting to take a long hard look at concurrency.

However, there doesn’t seem to be any authoritative definition of the term “Erlang-Style Concurrency”. I thought I’d at least try to give my view.

Joe Armstrong writes the following in his book, Programming Erlang:

“In Erlang:

  • Creating and destroying processes is very fast.
  • Sending messages between processes is very fast.
  • Processess behave the same way on all operating systems.
  • We can have very large numbers of processes.
  • Processes share no memory and are completely independent.
  • The only way for processes to interact is through message passing.”

I guess this is about as concise and authoritative definition as any. However, it’s not the whole story.
I would add the following items:

  • Message passing is asynchronous.
  • Processes can monitor each other.
  • It is possible to selectively receive messages.
  • Remote processes appear largely the same as local processes.

So this roughly describes how concurrency works in Erlang. Are all these bullets necessary for Erlang-style Concurrency? Perhaps not.

I would pick the following items as necessary for Erlang-style Concurrency:

  • Fast process creation/destruction
  • Ability to support >> 10 000 concurrent processes with largely unchanged characteristics.
  • Fast asynchronous message passing.
  • Copying message-passing semantics (share-nothing concurrency).
  • Process monitoring.
  • Selective message reception.

A few words to explain the above:

Speed and scalability
In order for concurrency to be useful as a fundamental modelling paradigm, the programmer must feel that processes are cheap enough you can efficiently create as many processes as the problem calls for. If there is any single defining characteristic of Erlang-style Concurrency, it is that you should be able to model your application after the natural concurrency patterns present in your problem. If creating a process is perceived as expensive, programmers will reuse existing processes instead; if message passing is perceived to be expensive, various techniques will be invented to avoid sending messages. These workarounds are nearly always detrimental.

There will always be limits. Erlang was designed for agent-style concurrency; not for massive data parallelism. You can “only” have ca 120 million concurrent processes in Erlang, provided you have enough memory. Personally, I’ve verified that Erlang performs consistently up to 20 million concurrent processes (I didn’t have enough memory to go further.) The cost of creating an Erlang process (ca 5-10 us) will be prohibitive for some applications, but it’s at least an order of magnitude cheaper than creating UNIX processes or POSIX threads. It’s reasonable to expect Erlang-style Concurrency to use roughly the same process granularity as Erlang does.

Asynchronous message passing
It has been argued that synchronous message passing is a better building block than asynchronous message passing. It is probably true that synchronous message passing is much easier to reason about, but asynchronous communication (“send and pray”) feels more intuitive in a distributed environment (in which case environments based on synchronous message passing will resort to some form of asynchronous communication as well). Anyway, it is reasonable to argue that asynchronous communication is a defining characteristic of Erlang-style Concurrency.

Copying semantics
Note that this doesn’t necessarily mean that messages must be copied, but they must act as if they were. There are several reasons why this is important:

  • For reliability reasons, processes must not share memory
  • In the distributed case, copying is inevitable, and we want as similar semantics as possible between local and distributed message passing.

It is perhaps too strong to require that processes share nothing, as this would rule out Erlang-style concurrency in most existing languages (even languages like Scala make it possible to share data between processes). Let’s just say that the Erlang style is to make it easier to use share-nothing concurrency than to use sharing.

Process monitoring
This is the enabler of “out-of-band” error handling, or the “let it crash” philosophy, which is very typical for Erlang. The idea is that you program for the correct case, and rely on supervising processes for error handling. This has turned out to be a beautiful way of handling fault tolerance in large systems.

Selective message reception
There are many ways to achieve selective message reception, but for complex multi-way concurrency, you have to support at least one of them. This is something often overlooked when approaching concurrency. It’s not obviously needed in simple programs, but when it is needed, you will suffer complexity explosion if you don’t have it.

I would suggest that ability to support selective message reception is sufficient for Erlang-style concurrency. It can be done either by matching on a single message queue (Erlang, OSE Delta), or by using different mailboxes/channels (Haskell, .NET, UNIX sockets). Pick one, and find a suitable example to experiment with. You may use my presentation “Structured Network Programming” (also attached as pdf) as inspiration, if you like.

47 thoughts on “What is Erlang-Style Concurrency?

  1. Pingback: Pages tagged "languages"

  2. Pingback: Wiger on Erlang-style Concurrency :: Steve Vinoski’s Blog

  3. Ulf, thanks for the great article. I would another condition, which although it doesn’t affect the coding style, it ensures “peace of mind” when writing applications in a language that uses lightweight processes: the guarantee that a process will never block the scheduler for a long time (e.g. for doing IO or calling native code). This is achieved in Erlang by the event-driven network IO core and the port-based native code interface. If those processes could call functions that block the emulator for more than a few milliseconds, performance could seriously suffer. When your language uses OS threads, this condition is a given, but when the VM does the scheduling, more restrictions are required on interactions with native code. All non-Erlang languages/libraries I know that supposedly implement Erlang-style concurrency lack this feature.

  4. Pingback: Erlang-China » [译]Erlang风格的并行

  5. Yariv – I was thinking about whether to include scheduling characteristics. It’s sort of related to Joe’s “processes behave the same way on all operating systems”. A classic problem with Java was that it didn’t specify whether scheduling should be cooperative or preemptive. Of course, if you want to build the language on top of a concurrency paradigm, you must nail those things down. I’m still undecided, though. I wanted to keep the listing reasonably short, and focus on what I think qualifies as “Erlang-style”. (:

  6. I always interpreted “Erlang-Style Concurrency” to mean: the actor model of concurrency. (As opposed to the shared-memory model of concurrency.)

  7. DAR – you’re not wrong, of course. The question is just how much one should read into “Erlang-style concurrency”. I thought that one should raise the bar a little bit, to highlight the fact that there is much more to Erlang and concurrency than just copying asynchronous message passing.

  8. I have been programming using Scheme48, a Scheme system that offers a Concurrent ML-like library [1]. A while ago, there was some excitement over Termite, another Scheme system that’s supposed to offer ‘Erlang-like’ concurrency. I never got around to using Termite, and I never figured out the difference between the two approach.

    I saw that you once made a posting to LtU on this topic as well, could you elaborate on the difference between the two?

    [1] http://mumble.net/~campbell/s48-refman/html/Concurrent-ML.html

  9. Dataflow technology implements processes and queues between the processes, very similar to Erlang. Dataflow is very well suited for large data processing. I’m not all that familiar with Erlang, but don’t believe it is as well suited for such processing. Dataflow is very much a data-oriented architecture, a new way of problem decomposition. Check out http://www.pervasivedatarush.com for more information.

  10. Pingback: Sp3w

  11. Duncan – I’ve never used Termite (nor CML), but Termite is one of the most ambitious attempt at implementing Erlang-style concurrency in another language (Glasgow Distributed Haskell is up there too). CML relies on synchronous message passing, and (too my knowledge) doesn’t implement distributed message passing or process supervision. Termite even shows a way of doing process migration – something that Erlang cannot do. I don’t want to make a qualitative comparison, since I’m not that familiar with either. The concurrency support in CML is designed to fit well into the ML language (to ML users, this is obviously a significant advantage). I don’t know what that buys you in Scheme, or whether Termite is a better fit with Scheme overall.

  12. Jim – You’re right in noting that Erlang may not be ideal for large data processing. The main problem may not be the concurrency model, but rather dynamic typing, which gives the compiler less information to work with. I think Haskell may have much better potential in this area, and some very interesting research has certainly been done http://www.haskell.org/haskellwiki/Research_papers/Parallelism_and_concurrency) and Simon Peyton-Jones’ slides, http://research.microsoft.com/~simonpj/papers/ndp/NdpSlides.pdf, give a pretty good idea of what’s possible when using Haskell for data parallelism.

    There is another take on dataflow programming, which perhaps would feel more natural in Erlang: Flow-based Programming (http://www.jpaulmorrison.com/cgi-bin/wiki.pl), but this seems different from what you’re talking about.

  13. Pingback: NServiceBus implements Erlang Concurrency

  14. I’m a researcher at the University of Kent, working with a parallel language called occam-pi (http://occam-pi.org/).

    It’s interesting to note that occam matches the first set of bullet points above (millions of processes, nanosecond communication/process creation times, compiler-enforced alias freedom, etc.), but not the second — communication is synchronous, there’s no process monitoring, and the last two don’t really apply because we have explicit channels rather than addressed messages.

    We usually describe occam-pi as a “process-oriented” language, but this suggests that you can be more specific. Perhaps both occam-pi and Erlang are in the larger class of process-oriented languages (which makes sense given how similar they are in many regards), and occam-pi uses a “channel-oriented” communication model whereas Erlang is “message-oriented”…

  15. Adam – I agree. As for the explicit channels, this allows Occam to avoid “accidental ordering” that can happen if you collect messages from multiple sources in a single mailbox. I think the term “channel-oriented” is good, but I’m not sure what would properly describe the Erlang approach of pattern-matching on a single message queue. (:

    I like Occam a lot. It was the second concurrency-oriented language I ever learned (the first was Concurrent Euclid).

  16. Pingback: Bert Lamb » links for 2008-02-11

  17. Pingback: A followup on Concurrency within Python — jessenoller.com

  18. The PDF version of your presentation crashed Foxit Reader moving from page 3 to 4.

  19. @Bob: I’m sorry. I don’t have Foxit Reader, and am not sufficiently knowledgeable about the PDF format to determine the source of the problem. It works in Acrobat Reader, as far as I can tell.

  20. @Stefan: Erlang uses preemptive scheduling, at least as far as the user is concerned. In the traditional, single-CPU implementation, scheduling was reduction-based and very predictable. In multicore Erlang multiple scheduler threads are premptively scheduled, meaning that scheduling becomes truly preemptive from the user perspective.

    Message passing does not force a task switch. The receiving process is marked as runnable, but the sending process may well be allowed to continue (unless it is scheduled out for other reasons).

  21. @web design company: This is true in a way, except for the fact that erlang processes are much more lightweight than POSIX processes. This has profound implications on how processes are used in system modeling.

  22. @codingthewheel: Thanks for the invite. I view Erlang and C as complementary, and I do agree that every programmer should learn C. I don’t necessarily agree that they should do their concurrency programming in it, though. (:

  23. Pingback: -= Linkage 2008.02.08 =-

  24. An intriguing extension to “copy semantics” is one of “ownership transfer semantics” through uniqueness proofs. With it a mutable object can be passed in a message as long as the system can prove the object is not referenced (directly or indirectly) by the sending actor after it has been sent. That means a) distributed actors can still copy as needed and b) mutation is race free.

    In a way, uniqueness is an optimization of immutable data. With immutable data, “updates” often require some copying (see Okasaki’s Purely Functional Data Structures for techniques to keep the copying to a minimum). With uniqueness typing, structures can be updated in place but no other part of the system can witness that update so nothing can tell the difference between a copy/update on immutable data and a destructive update on mutable data.

    The most famous example of uniqueness typing is in “Clean” which, like Haskell, is purely functional but unlike Haskell has a way to do in-place mutation without the challenges of an effects system or the IO monad. In fact, Clean’s equivalent of the IO monad is based on passing a unique “world” from function to function.

    Mercury, a logic language, has something similar called “linear typing” and like Clean, uses it to embed IO in an (otherwise) side effect free language.

    There’s some research into adding uniqueness types to Scala in http://lamp.epfl.ch/~phaller/uniquerefs/ and the paper alludes to its use in actors.

    That said, uniqueness isn’t always the right answer. For instance, a unique object by definition could not be sent by one actor to two different actors at the same time. That’s why uniqueness types and “ownership transfer semantics” can be a useful addition to but not a replacement for “copy semantics” and immutable data.

  25. Can Erlang actors/process save state? If so, how can actors be kept in a consistent state if an update is required to more than one? Maybe I’m thinking about their use incorrectly.

  26. Ulf, you’ve mentioned an interesting number – process creation time is about 5-10 microseconds for Erlang. May you provide such reference number for message passing? Btw, do you mean exactly process creation or process creation+destruction? Process creation time alone is usually not interesting in agent-oriented environment because of the dynamic nature of most systems. The same is true for message passing – send+receive time is the most interesting. I’ve heard about ‘cheap’ and ‘lightweight’ process creation/message passing in Erlang zillions times, but failed to find any real numbers. Thank you.

  27. @James, as you say, ownership transfer can be viewed as an extension of copying semantics. It and uniqueness typing are certainly exciting concepts.

  28. @Mark, the ‘state’ of an Erlang process is generally considered to be the current function and its input arguments. Strictly speaking, it’s rather the contents of the function call stack + the contents of the process dictionary + the message queue… Anyway, there is no built-in way to serialize a process and ‘save’ it, although it’s not too difficult to write a function that does this at least well enough for some practical uses.

    With ‘update’ I assume you mean code update? Processes tend to either pick up the newest code automatically the next time they enter that module, or can be written to switch to the new code only at a given signal. If processes need to be synchronized during code update (and this is not at all uncommon), OTP provides a set of conventions and control messages to achieve this using normal Erlang message passing.

    With some forethought, one can also use pattern matching at message reception to handle both new and old message formats and act accordingly. This can be particularly useful when processes reside on different nodes, making consistency more expensive. It does clutter the code somewhat, of course.

  29. @Dmitriy, simplifying a bit, message passing cost is on the order of a few (< 5) microseconds + data copying cost. Now, with manycore systems, things are not so clear-cut. As you say, the more interesting figure is the cost of message send + contect switching and reception. As it happens, this is also easier to measure, and the traditional ring benchmark does this pretty well. On my old 400 MHz single-core UltraSPARC at work, this would cost ca 4 usec for a very small message. Also, as you point out, process creation and teardown is the interesting bit in a dynamic system. As Erlang processes have separate heaps, terminating a process is basically a free() operation + possibly sending notifications to linked processes. In practice, this tends to mean that you can indeed spawn and kill a process in only a few microseconds on a modern machine. However, I haven't really attempted to obtain a precise number, and this is not entirely trivial either, since both process creation and process termination are asynchronous operations.

  30. I would also add “processes can be messaged across system boundaries as easily as local processes”. That’s the big one that prevents me from using Haskell for things I currently use Erlang for; Haskell is arguably better on a single machine, especially for performance, but when you want to have high-availability, you’re sort of stuck.

  31. @Jeremy, yes this was an important input requirement for Erlang – one shouldn’t have to know the location of a process in order to talk to it, but if one does need to know, the information should be available. I tried to incorporate that by noting that “Remote processes appear largely the same as local processes”. The question is whether it should be viewed as a necessary for “erlang-style concurrency”?

    One counter-argument might be that distribution wasn’t added to Erlang until 1993 – after the publication of the first Erlang book! (:

    However, the idea of Distributed Erlang was there from the start, and the semantics of process spawning and message passing were created with distribution in mind.

  32. Ulf, I note that you made no mention of Erlang’s insistence on single assignment. Do you think that this is required to achieve efficient copying semantics? Conversely, could a system that did not enforce single assignment be said to have Erlang-style concurrency?

  33. Ulf –

    thanks for providing the checklist as it helps to establish the bar for other languages.

    I would say that not all of the checklist are *fundamental* to have in the core language *if* the language has good extension facility.

    The fundamental ones, i.e. difficult if not impossible to have at module level, are the microthreads and asynchronous message passing – the rest can be built as extensions.

    I have just built the selective reception capability as a module for PLT Scheme on top of its microthreads – http://weblambda.blogspot.com/2009/09/erlang-style-programming-in-plt.html – distributed message and process monitoring can also be added in due time (and necessary effort of course).

    It is true for other languages that cannot blur the line between the language and the extensions, the language provider would have to work harder to provide most if not all of the checklist.

  34. @John, this very issue came up in the discussion about Tony Arcieri’s blog post “Single Assignment Myths” (http://tonyarcieri.org/articles/2008/09/30/single-assignment-myths which is unavailable as I’m writing this – hopefully not permanently). Reia is a Ruby-like language that runs on top of the Erlang VM. It supports Erlang-style concurrency, but does not have single assignment. It’s perfectly reasonable to have destructive assignment combined with copying message passing, and I don’t see that it would have to slow anything down.

  35. @YC, indeed, some languages have superb extension facilities (LISP and Haskell come to mind) and can provide almost anything as libraries on top of the core language. On the other hand, one really wants all libraries to use a common concurrency model, for example (or at the very least, compatible concurrency models), so even if these things are technically implemented as libraries, they really have a tendency to become part of an extended language definition, at least.

    There are several concurrency models implemented on top of the JVM, for example, but their use is limited, partly because the vast majority of Java libraries assume the shared-object concurrency model provided by the core language.

    Similarly, I’ve witnessed C++ projects get into trouble because they changed the concurrency model and never got around to rewriting their old code, and ended up having to develop incompatible implementations of the same features for both models.

  36. Pingback: Daily Links #98 | CloudKnow

  37. @Kevin: It is very similar to CSP, although selective receive with implicit buffering would have to be modeled on top. Extending an existing language easily becomes problematic, as there are often existing assumptions that are at odds with the share-nothing lightweight concurrency aspects – not to mention the error handling.

  38. Pingback: Quora

  39. Pingback: Erlang style concurrency in the D programming language | Ask Programming & Technology

Leave a Reply

Your email address will not be published. Required fields are marked *