PCall, or parallel call, is a Common Lisp library intended to simplify 'result-oriented' parallelism. It uses a thread pool to concurrently run small computations without spawning a new thread for each one. This makes it possible to exploit multiple cores without much extra fuss.
Note that there exists a fork of PCall, Eager Future, which is (at this time) more actively developed.
03-09-2009: Version 0.3: Release some changes that have been sitting in the repository for months now. select-one and worker environments have been added.
26-01-2009: Version 0.2: Since there suddenly is some increased attention for the library, I'm releasing the current code as 0.2. Still beta-ish, but seems to work. This version adds with-local-thread-pool.
06-06-2008: Version 0.1: The first release. Should be considered beta. Any testing and feedback is appreciated.
05-06-2008: Added a background article with some related thoughts.
PCall is released under a zlib-like license, which approximately means you can use the code in whatever way you like, except for passing it off as your own or releasing a modified version without indication that it is not the original. See the LICENSE file in the distribution.
PCall depends on bordeaux-threads, and on fiveam for the test suite.
The latest release of PCall can be downloaded from http://marijn.haverbeke.nl/pcall/pcall.tgz, or installed with asdf-install.
A darcs repository with the most recent changes can be checked out with:
> darcs get http://marijn.haverbeke.nl/pcall/pcall
Or you can look at the repository through darcsweb.
There is a Google group for any discussion relating to this project. Also feel free to drop me an e-mail directly: Marijn Haverbeke.
PCall is a rather simple library. There are only three basic concepts that you have to come to grips with:

- The thread pool, which holds the worker threads that execute queued tasks. See thread-pool-size and finish-tasks for ways in which it can be managed.
- Tasks, created with pcall and pexec. This creates a computation that may be parallelised, and enqueues it. When given the chance, the threads in the thread pool will take such a task, execute it, and store its result.
- Joining (join), which extracts the value from a computation. This blocks when the task is still running. If it has not yet started, the joining thread will claim and execute the task itself.

Imagine that we have this wonderful algorithm for computing (once again) Fibonacci numbers:
(defun fib (n)
  (if (> n 2)
      (+ (fib (- n 1)) (fib (- n 2)))
      1))
(time (fib 40))
Depending on your machine, this might take some 2 to 10 seconds. We don't have that kind of patience. You can see that this algorithm is entirely optimal, so our only option, it seems, is to use more hardware ― or make better use of the hardware we have:
(time (let ((f39 (pexec (fib 39)))
            (f38 (pexec (fib 38))))
        (+ (join f39) (join f38))))
On my 2-core machine, that speeds things up by about a third ― which makes sense, since computing fib(39) is about twice as much work as computing fib(38). A nicer way to write the same thing is:
(time (plet ((f39 (fib 39))
             (f38 (fib 38)))
        (+ f39 f38)))
plet takes care of the wrapping and joining in cases like this. Why do we need the let anyway? You could try this:
(time (+ (join (pexec (fib 39))) (join (pexec (fib 38)))))
... but that won't buy you anything. Both tasks have to be created before you join the first one; otherwise the second task will not yet exist while the first one runs, and thus won't be able to run concurrently with it.
You might be tempted to write something like this:
(defun pfib (n)
  (if (> n 2)
      (plet ((a (pfib (- n 1)))
             (b (pfib (- n 2))))
        (+ a b))
      1))
... but don't. There is some overhead associated with creating and executing tasks, and for a function like naive Fibonacci, which recurses a zillion times even for small inputs, this will radically slow your algorithm down. A parallel mapping function, as shown below, works great for mapping a relatively heavy function over a list of limited length, but is no use for mapping 1+ over a million elements.
(defun pmapcar (f list)
  (let ((result (mapcar (lambda (n) (pexec (funcall f n))) list)))
    (map-into result #'join result)))
(defvar *numbers* (loop :for i :from 0 :below 30 :collect i))
(time (mapcar #'fib *numbers*))
(time (pmapcar #'fib *numbers*))
Note that joining tasks is not required. When you do not care about the result of a computation, you can just spawn the task and leave it at that.
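For instance, a fire-and-forget sketch might look like this (log-event is a hypothetical helper, standing in for any side-effecting work whose result you do not need):

```lisp
;; Hypothetical logging helper -- any side effect we don't need to wait for.
(defun log-event (message)
  (format *error-output* "~&event: ~a~%" message))

;; Enqueue the task and move on; a pool thread will run it eventually.
(pexec (log-event "cache refreshed"))
```
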
As a final note, PCall can also be used when a program is not CPU-bound, but needs to do some tasks that are hampered by other bottlenecks (network latency, disk speed). If they can be executed in parallel, you can have the thread pool run them. In the following example, the second version runs three times faster on my machine:
(defvar *urls* '("http://marijn.haverbeke.nl/pcall"
                 "http://common-lisp.net"
                 "http://eloquentjavascript.net"
                 "http://xkcd.com"))
(time (mapc 'drakma:http-request *urls*))
(time (mapc 'join (mapcar (lambda (url) (pexec (drakma:http-request url)))
                          *urls*)))
In some applications, doing multiple database queries at the same time could really help. You might need to increase the size of the thread pool in such a situation, since some threads will be sitting idle, waiting on a socket.
function pcall (thunk) → task

Create a task that will call the given argumentless function.

macro pexec (&body body) → task

A shorthand for (pcall (lambda () ...)).
macro plet ((bindings) &body body)

Follows the behaviour of let, but wraps every bound value in a pexec form, and automatically adds join calls around uses of the bound variables.
function join (task) → result

Waits for the given task to finish, and then returns any values the task produced. When executing the task raised an uncaught error, this error will be raised when joining the task. (Note that this re-raising might cause problems with condition handlers, which might not be active in the worker threads. If you are using handlers, take extra care, and look at set-worker-environment.)

When, at the moment join is called, the task has not been assigned to any thread, the joining thread will execute the task itself. (Note that this makes the dynamic environment in which the task runs unpredictable.) A task may be joined multiple times. Subsequent joins will again return the values, without re-executing the task.
function select-one (&rest tasks) → task
Waits until at least one of the given tasks is finished and then returns that task.
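As a sketch, select-one can be used to race several tasks and take whichever finishes first (the other tasks keep running and can still be joined later):

```lisp
;; Start three computations of different sizes, then block until the
;; first one completes and return its value.
(let* ((tasks (list (pexec (fib 30))
                    (pexec (fib 25))
                    (pexec (fib 20))))
       (winner (apply #'select-one tasks)))
  (join winner))
```
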
function done-p (task) → boolean

Tests whether a task has been executed.
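done-p makes it possible to poll a task instead of blocking on it, for example (a sketch):

```lisp
(let ((task (pexec (fib 35))))
  ;; Check in periodically; other useful work could go in this loop.
  (loop until (done-p task)
        do (sleep 0.1))
  ;; The task is already finished, so this join returns immediately.
  (join task))
```
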
function thread-pool-size ()

Returns the current size of the thread pool. Also supports setf to change this size. The default value is 3.
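Since the size is setf-able, adjusting it looks like this (sizing the pool to the number of cores, or larger for I/O-bound work, is a heuristic of the caller, not something PCall does for you):

```lisp
(thread-pool-size)          ; query the current size (3 by default)
(setf (thread-pool-size) 8) ; grow the pool, e.g. for I/O-bound tasks
```
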
function set-worker-environment (wrapper)

This can be used to make dynamic variables or bound handlers available in the worker threads. wrapper should be either nil, for no wrapping, or a function that, given a function argument, calls its argument in the new environment. Works best with local thread pools.
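As a sketch, a wrapper that carries a dynamic variable over into the worker threads might look like this (*database* is a hypothetical special variable, not part of PCall):

```lisp
(defvar *database* nil) ; hypothetical connection, bound in the main thread

;; Capture the current value, then have every worker re-bind it around
;; each task it executes.
(let ((db *database*))
  (set-worker-environment
   (lambda (thunk)
     (let ((*database* db))
       (funcall thunk)))))
```
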
function finish-tasks ()
Takes the current threads out of the pool, waits for the task queue to empty, and then for the threads to finish executing any tasks they might be busy with. This is intended to be called when shutting down ― killing the threads might cause some tasks to be aborted, which could result in data loss. If you join every task you create, this should not be necessary. Note that if you call this function while other threads are still creating tasks, it might never return.
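In a shutdown routine this might be used as follows (shut-down-server is a hypothetical function of your application, not part of PCall):

```lisp
(defun shut-down-server ()
  ;; Drain the queue and let workers complete their current tasks,
  ;; so no queued work is silently dropped when the process exits.
  (finish-tasks)
  ;; ... close sockets, flush logs, and so on.
  )
```
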
macro with-local-thread-pool (&key size on-unwind worker-environment)

Run body with a fresh thread pool. If on-unwind is :wait (the default), the form will wait for all local tasks to finish before returning. If it is :leave, it will return while threads are still working. Given :stop or :destroy, the threads will be stopped at the end of the body. With :stop, they will first finish their current task (if any); with :destroy, they will be brutally destroyed and might leak resources, leave stuff in an inconsistent state, etcetera. worker-environment can be used to give the workers in the local pool a specific dynamic environment.
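For example, a local pool combined with a worker environment might be sketched like this (*request* is a hypothetical dynamic variable; the wrapper follows the same shape as for set-worker-environment):

```lisp
(defvar *request* nil) ; hypothetical per-request context

(let ((req *request*))
  (with-local-thread-pool (:size 5
                           :on-unwind :wait ; the default: block until done
                           :worker-environment
                           (lambda (thunk)
                             (let ((*request* req))
                               (funcall thunk))))
    ;; Tasks created here go to the local pool, not the global one.
    (join (pexec (fib 30)))))
```
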