POPL impressions

Back home from my first POPL conference, I’m trying to collect my thoughts.

Overall, I was impressed with the high quality of the papers and presentations. While it’s not immediately obvious why industry representatives ought to attend POPL (which is, after all, a very academic conference), I’d like to suggest that it’s worth your while. You can expand your horizons and your network, and improve your intuition about where the programming community appears to be headed. Especially as we are experiencing a major paradigm shift, I’d expect that the first indications of trouble (or hope) might well come at conferences like POPL.

The panel discussion compensated somewhat for the lack of Erlang representation in the audience and on the podium, as message passing and Erlang received favourable mention.

Much of the discussion hovered around parallelism, of course, but Martin Rinard (represented by Arvin) offered the somewhat provocative view that today’s software is over-engineered and has too few errors in it! I had touched on this in my DAMP presentation the day before, when I mentioned that correctness is elusive in our commercial products, since the specifications themselves are often buggy (a claim also made by Mats Cronqvist at the Erlang Workshop 2004: “Most of the errors were not coding errors, but a working implementation of the wrong thing.”).

My main reason for attending was the DAMP workshop, where I’d been asked to give a tutorial about Erlang programming for Multicore (presentation here). I’d call it an experience report instead, since (a) Erlang’s approach to multicore programming is that it’s supposed to be transparent, and (b) I’d be embarrassed to try to give a programming tutorial in front of people like Simon Marlow, Satnam Singh and Xavier Leroy…

I initially decided to put much emphasis on testing parallel programs using QuickCheck, and this also became the reason why I could go at all. The ProTest project sponsored the trip, since it gave us a chance to present some of its latest developments. I also think this is a very valid angle. Testing and debugging in the face of non-determinism is a real challenge. While preparing the presentation, we had great fun trying out the newest QuickCheck stuff and beating up some of my old discarded code experiments. We need to continue the work, and put some more focus on the semantics of the API. I’m pretty convinced that the code cannot be made bug-free on multicore without revisiting the API (which mimics Erlang’s built-in functions, by the way).

The DAMP workshop was great. I wish the Erlang maintainers could have been there. I enjoyed the talks and the discussions. Simon Marlow’s talk on Haskell’s concurrency primitives was very enlightening, and should be read by all who are interested in STM. John Reppy’s presentation on Manticore (haven’t found it on line, but this one appears similar) was also intriguing, perhaps especially the idea of an abstract low-level language for parallel execution. It’s possible that we’ll need something like that if we are to tackle a diversity of many-core architectures. I also enjoyed Sven-Bodo Scholz’s talk on Single-Assignment C (SaC), and couldn’t help thinking about whether it would be possible to integrate that into Erlang. (:

As always with these conferences, one of the greatest rewards is the socializing in the evenings and during the breaks. I will try to spend some time studying Barry Jay’s pattern calculus and the Bondi language. There seems to be some kinship here with the Erlang language…

Erlang Programming for Multicore

I was invited to give a tutorial at the Declarative Aspects of Multicore Programming (DAMP) workshop, in conjunction with POPL 2009. This was my first POPL, and I want to get back to my impressions in another post, but for now, here is the presentation.

The presentation was developed as part of the ProTest project.

Sorry for being so short, but my laptop power adapter broke, and I’m writing this on a hideously expensive computer in the hotel business center.

Windows tricks for erlangers

Not that I spend much time compiling erlang code in Windows (unless I’m using cygwin or coLinux), but the issue does pop up from time to time. Here are a few minor tricks that can help a bit.

Starting Erlang in the current directory
For a unix user, it’s of course odd that this should be a problem.

John Hughes used a nice trick that’s so simple that I slap myself for not having thought of it: Right-click on a .beam file, select “Open With…” and locate werl.exe. Now you can open an erlang shell in the current working directory by double-clicking a .beam file.

Modifying the Windows context menu
So what if you don’t have any .beam files, and you’re trying to get to an erlang shell in order to create some?

I did some googling and found this tutorial on youtube on how to add custom entries to the context menu.

It seemed simple enough, but Vista still served me a few hours of utter confusion since it took what I inserted, and then copied it to another place in the registry (without letting me know)… but only the first time. My changes and additions were simply ignored. The solution? Search the registry for the key you inserted, and you’ll figure out where Vista wants it to be, then make your changes there.

What I’ve experimented with so far is to add under Computer\HKEY_CLASSES_ROOT\ErlangSource\shell\ the following entries:


Compile with erlc
- Command = "C:\Program Files\erl5.6.3\bin\erlc.exe" "%1"
Make all
- Command = "C:\Program Files\erl5.6.3\bin\werl.exe" "-make"
Erlang shell
- Command = "C:\Program Files\erl5.6.3\bin\werl.exe" "%1"

A problem with running erlc this way is that the window is destroyed immediately upon completion. I’ve poked around a bit for good workarounds. One option is of course to write an erlang function that compiles the file, then either sleeps a short while, or waits for input.
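A minimal sketch of that last option, waiting for input rather than sleeping. The module name compile_wait and its functions are made up for illustration; only compile:file/2, io:get_line/1 and halt/0 are standard calls.

```erlang
-module(compile_wait).
-export([main/1, file/1, describe/1]).

%% Entry point for the -run flag, which passes its arguments as a
%% list of strings. Stops the emulator once the user has pressed Enter.
main([File]) ->
    file(File),
    halt().

%% Compile File, print the outcome, then block until Enter is pressed
%% so that the window stays open long enough to read the messages.
file(File) ->
    Result = compile:file(File, [report_errors, report_warnings]),
    io:format("~s~n", [describe(Result)]),
    io:get_line("Press Enter to close... "),
    Result.

%% Turn the return value of compile:file/2 into a short message.
describe({ok, Mod}) ->
    lists:flatten(io_lib:format("Compiled ~p", [Mod]));
describe(_Other) ->
    "Compilation failed".
```

With compile_wait.beam somewhere in the code path, the context menu entry could presumably start the emulator with -run compile_wait main "%1" instead of calling erlc.exe directly. I have not tried this particular variant from the registry.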

coLinux – the devil’s in the details…

Pleased with my initial success with coLinux, I set out to add the finishing touches.

I’ve gone back and forth on the network configuration, trying e.g. to carefully follow Bebbo’s config example. It didn’t work for me. There seem to be issues running TAP on Vista, and bridging with my Wireless LAN interface caused the WLAN connection to start acting up. I also tried the Microsoft Loopback adapter, but no joy. I finally figured out that just using slirp with the following (recommended) setting:

(update: I made a last minute change that proved wrong. This is what works.)
eth0=slirp
eth0=slirp,,tcp:5901:5900

worked just fine. The only problem was that I thought I had to set the DISPLAY variable to the (dynamically configured) IP address of my laptop. But looking at the auto-generated setting for eth0 in /etc/network/interfaces, I realized that slirp had assigned an IP address for my laptop, as seen from the coLinux end:

iface eth0 inet static
address 10.0.2.15
broadcast 10.0.2.255
netmask 255.255.255.0
gateway 10.0.2.2

The “gateway” address was the one I needed, so I could just hardcode the following in my .bashrc:

export DISPLAY=10.0.2.2:0.0

Xming is said to be a lighter and better X-Windows server than Cygwin X, so I set out to change. No major issues, except that I had a very hard time figuring out how to get it to display Swedish characters. I’d suffered this in the coLinux console too, but that was ok, since I intended to jump into X as soon as possible.

Fixing the language support in the console was … uhm, not so easy, but I’ll attribute that to my being sorely out of practice on linux admin chores. Googling indicated that the cure was

apt-get install locales
dpkg-reconfigure locales
apt-get install console-data
dpkg-reconfigure console-data

Close, but no cigar. I also had to do
apt-get install kbd

Now, I had Swedish characters in the console, but not in X. Finally, I found this post (in Swedish), suggesting that it’s a bug in Xming’s definition of the Swedish layout. Specifying Finnish layout instead solved the problem.

The command line for Xming now looks like this:

exec0="c:\Program Files\Xming\Xming.exe -multiwindow -clipboard -silent-dup-error -xkbmodel pc105 -xkblayout fi"

I also switched to PulseAudio. Following the instructions on the coLinux wiki worked for me.

I also tried setting the environment variable COLINUX_NO_SMP_WORKAROUND=Y, to get coLinux to use both cores. I was disappointed to see that erl -smp still only saw one CPU. Apparently, coLinux makes internal use of SMP, but makes it look like a single-core system for the linux applications. My win32 version of Erlang does see both CPUs, though.

Also, it seems as if erlang under coLinux suffers from the same problem as under Win32 – erlang:now() doesn’t have sufficient resolution:

Eshell V5.6.4 (abort with ^G)
1> [erlang:now() || _ <- lists:seq(1,10)].
[{1224,77892,229402},
{1224,77892,229403},
{1224,77892,229404},
{1224,77892,229405},
{1224,77892,229406},
{1224,77892,229407},
{1224,77892,229408},
{1224,77892,229409},
{1224,77892,229410},
{1224,77892,229411}]

Note how it increments by 1 usec each time? On my other laptop running Ubuntu, the diff is ca 12 usec between each (by definition, erlang:now() always steps up at least 1 usec). I wish I could say that my Vista laptop is ~10x faster, but that’s not it. Apparently, coLinux doesn’t offer a gethrtime(). gethrvtime() also seems to be missing, and configure reports clock_gettime() as “not stable”. No disaster, perhaps, but a good erlang:now() is often useful.
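To look at the step size directly rather than eyeballing the tuples, the consecutive differences can be computed with timer:now_diff/2. This is just a small helper sketch; the module name now_gaps is made up.

```erlang
-module(now_gaps).
-export([gaps/1]).

%% Given a list of {MegaSecs, Secs, MicroSecs} timestamps, return the
%% difference in microseconds between each consecutive pair.
gaps([T1, T2 | Rest]) ->
    [timer:now_diff(T2, T1) | gaps([T2 | Rest])];
gaps(_) ->
    [].
```

Feeding it the list from the shell session above gives nothing but ones, e.g. now_gaps:gaps([erlang:now() || _ <- lists:seq(1,10)]), whereas a system with a usable high-resolution clock should show larger and more varied gaps.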

coLinux on top of Vista

Since I’m lugging around a dual-core Vista laptop, I thought I’d try to make it more useful for hacking.

I’ve tried using the win32 version of xemacs, Eclipse (crashed due to some lock violation), other editors like TotalEdit, et al, but it all feels very clunky compared to a unix environment. Cygwin is ok, but has its issues as well. For one thing, running erlang from within a cygwin shell messes up command-line editing.

Enter Cooperative Linux, or coLinux – a port of the Linux kernel that allows it to run natively alongside another OS. I played around a bit with various pre-built distributions, and finally settled for the bare-bones Debian Etch root file system that comes as an option with the basic coLinux install. I use my existing Cygwin X server, and resized the root partition up to 8 GB.

So far, so good. I’ve used apt-get to install the necessary utilities, and was able to build Erlang/OTP R12B-4 without problems. The system feels pretty much like a native linux installation.


The thing I have yet to figure out is how to share a file system between coLinux and Vista. I thought of using a Cygwin NFS daemon, but I don’t seem to have enough access privileges on my work laptop to pull that off. But overall, it feels extremely promising.

Contribution summary page

I once posted a question to the erlang-questions list in order to find out which of my Open Source contributions were actually used. I was hoping to remove some of them.

This didn’t seem to be doable, and I was instead asked to put together a summary of my different contributions, together with their status and what I thought I had learned from them.

Well, that was a number of years ago, but I finally put together something along those lines: My Erlang Projects (available from the right side bar). At least it was helpful to me – I guess that’s the main value of most blog activities anyway. (:

Indentation-sensitive Erlang 3

So, maybe I’m thick enough not to realise in advance when I’m barking up the wrong tree, or perhaps I just like to follow things through to the bitter end, just to know what exactly didn’t work…?

I’ve gone one more round with my indentation-sensitive Erlang scanner/parser. For a while, it looked like I was winning, but eventually, I had to admit defeat in the face of funs and record constructors.

Again, the approach I thought I’d take was to add indentation tokens to the normal scanner, and then add rule clauses to the normal grammar to make it understand both indentation-sensitive code, and all the code you’re accustomed to writing. The normal parser is generated by yecc from an LALR(1) grammar, which means that the additions have to be quite symmetrical in order to work.

This seemed to work in just about all the places that mattered, but I was stumped by funs and record constructors. The main problem with funs was that you could no longer write them in the conventional way (failing one of my preconditions), and record constructors usually cannot be written without including a few commas in a way that was …uhm, less than obvious.

I include the test module, which pretty much illustrates what works, and what doesn’t. At this point, I doubt that I can achieve better than a 90% solution, which is probably just good enough to eventually drive people crazy.

It’s been fun, though. I have no particular craving for indentation-sensitive syntax myself, so I thought I’d tackle this as a learning experience. Not for the first time, I feel like concluding that retrofitting concepts onto existing programming languages is usually very difficult to do well.

The code is at http://svn.ulf.wiger.net/indent/branches/0.3

-module(test).

-record(r,{a,b}).

-compile(export_all).
-scan(indentation).

f(X) ->
    X+2


g(X) ->
    X+4
.          % dot can either be 'outdented' or
           % terminating last line

g1(X) ->
    X+4.


h(X) ->
    Y = case X of
          a ->
            {a}
          b ->
            {b}
    end          % end is optional
    Y

h1(X) ->
    Y = case X of
          a ->
            {a}
          b ->
            {b}
    Y

h2(X) ->
    case X of
        a ->
            1
        b ->
            2

h3(X) ->
    case X of
        a -> 1;  % must use semicolon here :-(
        b -> 2
             

i(a) -> a
i(b) -> b


j(A,B) ->
    {A
    B}

j1(A,B,C) ->
    {A
     B     % indents must be greater than 1;
     C}    % else they count as aligned

k() ->
    S = "a string "
        "spanning multiple "
        "lines"
    {S}


l0() ->
    fun(1) -> 1; (2) -> 2 end.

l1() ->
    fun            % must break line here and indent.  :-(
      (1) ->
            1
      (2) ->
            2

%%% This, alas, doesn't work :-(
%%%
%%% l2() ->
%%%     fun(1) ->
%%%             1;
%%%        (2) ->
%%%             2
%%%     end

m() ->
    #r{a = 1, b = 2}.

m1() ->
    R = #r{a = 1}
    R#r.a

%%% indentation syntax works poorly for record assignment.
%%% Both commas are needed :-(
m2() ->
    R = #r{a = 1,
           b = 2},
    R

test() ->
    2 = f(0),
    4 = f(2),
    4 = g(0),
    4 = g1(0),
    8 = g(4),
    {a} = h(a),
    {b} = h(b),
    {a} = h1(a),
    {b} = h1(b),
    1 = h2(a),
    2 = h2(b),
    1 = h3(a),
    2 = h3(b),
    a = i(a),
    b = i(b),
    {a,b} = j(a,b),
    {a,b,c} = j1(a,b,c),
    {"a string spanning multiple lines"} = k(),
    F0 = l0(), 1 = F0(1), 2 = F0(2),
    F1 = l1(), 1 = F1(1), 2 = F1(2),
    {r,1,2} = m(),
    1 = m1(),
    ok.

Indentation-sensitive Erlang 2

So, I considered the comments (thank you, all), and thought I’d have another go at making the ending ‘dot’ optional.

I decided to introduce another token, ‘GAP’, to denote an empty line. Most likely, the scanner, in its current state, will not be able to handle empty lines with white space in them, etc, and the code is starting to look a bit confused. Oh well…

The toplevel rule for a function now becomes:


form -> function dot : '$1'.
form -> function 'GAP' : '$1'.

and the rule for alternative function clauses is as before:

function_clauses -> function_clause : ['$1'].
function_clauses ->
   function_clause ';' function_clauses : ['$1'|'$3'].
function_clauses -> 
  function_clause 'OUT' : ['$1'].
function_clauses ->
  function_clause 'END' function_clauses : ['$1'|'$3'].

The first two rules are the original rules for indentation-insensitive code. The last two are for the indentation tokens. The ‘OUT’ token is for symmetry, to match the ‘IN’ token after the arrow in function_body. Remember that indentation tokens are normalized in the scanner.

The test program now looks like this:

-module(test).

-compile(export_all).
-scan(indentation).

f(X) ->
    X+2

g(X) ->
    X+4
.

h(X) ->
    Y = case X of
          a ->
            {a}
          b ->
            {b}
    end
    Y

i(a) -> a
i(b) -> b

test() ->
    2 = f(0),
    4 = f(2),
    4 = g(0),
    8 = g(4),
    {a} = h(a),
    {b} = h(b),
    a = i(a),
    b = i(b),
    ok.

A little bit better. The ‘end’ tokens are still needed, though. One thing at a time…

Indentation-sensitive Erlang

I was inspired by Chris Okasaki’s blog article about mandatory indentation. Not that indentation could be made mandatory in Erlang – it would break way too much code – but the idea of inserting indentation tokens in the token stream did seem simple enough, that I at least had to try it.

I made a copy of erl_scan.erl (named erl_scan_ind.erl) and made it figure out indentation tokens. Then I added to the Erlang grammar in erl_parse.yrl. All the old rules remain, but some new rules were added to account for indentation tokens. For example:


clause_body -> '->' exprs : '$2'.

becomes:


clause_body -> '->' exprs : '$2'.
clause_body -> '->' 'IN' exprs 'OUT' : '$3'.

The indentation tokens I used were:

  • ‘IN’ for indent
  • ‘OUT’ for outdent (one for each matching indent)
  • ‘ALIGN’ for when the next line keeps the same indentation
  • ‘END’ when indentation goes back to zero
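To make the four tokens above a bit more concrete, here is a simplified sketch of how they could be derived from the indentation width of each successive non-empty line. This is not the code in erl_scan_ind.erl: the module name indent_sketch is made up, and the choices that a return to column zero yields a single ‘END’ (with no ‘OUT’ tokens before it) and that an outdent is not followed by an ‘ALIGN’ are assumptions of the sketch. It also skips the normalization step entirely.

```erlang
-module(indent_sketch).
-export([tokens/1]).

%% tokens(Indents) takes the indentation width of each successive
%% non-empty line and returns the indentation tokens that would be
%% emitted between those lines. The stack holds the open indent levels.
tokens([First | Rest]) ->
    tokens(Rest, [First], []);
tokens([]) ->
    [].

tokens([], _Stack, Acc) ->
    lists:reverse(Acc);
tokens([0 | Rest], [Top | _], Acc) when Top > 0 ->
    %% Back to column zero: a single 'END', and the stack is reset
    %% (assumption of this sketch).
    tokens(Rest, [0], ['END' | Acc]);
tokens([I | Rest], [Top | _] = Stack, Acc) when I > Top ->
    tokens(Rest, [I | Stack], ['IN' | Acc]);
tokens([I | Rest], [I | _] = Stack, Acc) ->
    tokens(Rest, Stack, ['ALIGN' | Acc]);
tokens([I | Rest], Stack, Acc) ->
    %% I is smaller than the top of the stack: close open levels.
    {Stack1, Acc1} = outdent(I, Stack, Acc),
    tokens(Rest, Stack1, Acc1).

%% Emit one 'OUT' for each indent level that the new line closes.
outdent(I, [Top | Below], Acc) when I < Top ->
    outdent(I, Below, ['OUT' | Acc]);
outdent(I, [I | _] = Stack, Acc) ->
    {Stack, Acc};
outdent(I, Stack, Acc) ->
    %% Outdent to a column not seen before: treat it as a new level.
    {[I | Stack], Acc}.
```

For example, indent_sketch:tokens([0,4,4,0]) describes a function head, two aligned body lines and the start of the next form, and returns ['IN','ALIGN','END'].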

So a sequence of expressions could be written without commas, based on the following rule:


exprs -> expr : ['$1'].
exprs -> expr ',' exprs : ['$1' | '$3'].
exprs -> expr 'ALIGN' exprs : ['$1' | '$3'].

My test program, which I was eventually able to compile, looked like this:

-module(test).

-compile(export_all).
-scan(indentation).

f(X) ->
    X+2
.

g(X) ->
    X+4
.

h(X) ->
    Y = case X of
          a ->
            {a}
          b ->
            {b}
    end
    Y
.

(Note especially that the final ‘end’ must be aligned with the Y, rather than the ‘case’. Perhaps this could be avoided…?)

The ending dots don’t have to be on their own line. Getting rid of them was too hard for me, since ‘dot’ is the end token for the Erlang grammar.

The -scan(indentation). attribute tells epp to switch to the indentation-sensitive scanner. -scan(normal). tells it to switch to the normal scanner.

I soon realised that I had to normalize the indentation tokens at the end of the scan. A few oddities were introduced, like inserting an ‘OUT’ token before each dot (and corresponding additions to the grammar). But for the most part, the additions to the grammar seemed fairly logical. The parser seems to handle all the old code, even though I should perhaps try recompiling the whole OTP source tree before making such a claim.

The code (based on OTP R12B-1) can be found at http://svn.ulf.wiger.net/indent/trunk

The grammar is still contaminated with some debug statements, which allowed me to print the productions as they were identified. They should of course be removed eventually.

I’m not convinced that this is really a good idea, but at least I had fun doing it.