Saturday, June 2, 2012

lparallel - A Parallel Programming Library for Common Lisp

Parallel programming is hard, and as CPUs are getting ever faster, I tend to find optimizing a single thread of control less risky than dealing with threads, locks and synchronized data structures. Recently, though, I had to deal with a reporting function that I started from a Hunchentoot handler and that ran long enough to make the client time out. I needed to somehow move the execution of that function into the background, and as I had heard good things about the relatively new lparallel library, I thought I would give it a try.

lparallel implements an abstraction layer for parallel programming in Common Lisp. In addition to relatively low-level concepts like tasks and channels, it implements mid-level promises and futures as well as high-level parallel mapping, binding and reducing functionality. Futures looked like a suitable mechanism to solve my problem at hand.
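
In a nutshell, a future evaluates a piece of code in a worker thread, and forcing it later blocks until the result is available. Just to illustrate the idea in isolation, a minimal sketch could look like this (slow-computation is merely a stand-in for any expensive function, and the kernel setup is explained further below):

;; A kernel provides the worker threads in which futures run (more on
;; this below); two workers are plenty for this illustration.
(setf lparallel:*kernel* (lparallel:make-kernel 2))

(defun slow-computation ()
  (sleep 5)
  42)

(let ((result (lparallel:future (slow-computation))))
  ;; The form inside LPARALLEL:FUTURE is already being evaluated in a
  ;; worker thread at this point.
  (lparallel:fulfilledp result)  ; => NIL while the worker is still busy
  (lparallel:force result))      ; blocks until done, then => 42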

The non-parallel version of my HTTP request handler looked something like this:

(hunchentoot:define-easy-handler (something-that-takes-long :uri "/sttl")
    (parameter)
  (if (eql (hunchentoot:request-method*) :post)
      (compute-and-display-response-page parameter)
      (show-job-parameter-form)))

The compute-and-display-response-page function does what its name suggests. As usual with Hunchentoot request handlers, it returns the page to display to the user. Likewise, show-job-parameter-form returns an HTML form that collects the parameter. Now, compute-and-display-response-page can take a long time, so instead of just delaying the response to the HTTP request until the function returns, I want to have it execute in the background and respond to the request right away with a "job has been started" message. The client is then supposed to poll at regular intervals. Once the background job completes, the page that the function has generated is returned to the client.

Using a future, I came up with something like this:

(hunchentoot:define-easy-handler (something-that-takes-long :uri "/sttl")
    (parameter)

  ;; Get or start Hunchentoot session context
  (hunchentoot:start-session)

  (let ((job-running (hunchentoot:session-value :job)))
    (cond

      ;; Previously started job has finished
      ((and job-running
            (lparallel:fulfilledp job-running))
       (hunchentoot:delete-session-value :job)
       (lparallel:force job-running))

      ;; Previously started job still running
      (job-running
       "Previous job still running")

      ;; Start new job
      ((eq (hunchentoot:request-method*) :post)
       (setf (hunchentoot:session-value :job)
             (lparallel:future
               (compute-and-display-response-page parameter)))
       "The job has been started")

      ;; Display job parameter form
      (t
       (show-job-parameter-form)))))

Hunchentoot's session mechanism is used to make the future accessible to subsequent requests. The future is stored in a session value; whether the background calculation has completed is determined by calling lparallel:fulfilledp. Once the computation is finished, the return value of compute-and-display-response-page is retrieved using lparallel:force.

This is mostly it, and I find parallel programming in this case easy to understand and reason about. Some additional things are worth mentioning:

  • To use lparallel, one has to create a kernel, which constitutes the worker thread pool in which futures and other lparallel tasks are executed. Multiple kernels can coexist; the kernel to use is determined by the lparallel:*kernel* special variable. In my application, I am creating a kernel with one worker thread (see the setup sketch after this list). This provides me with automatic request serialization, so that only one long-running function invocation exists at any time, without the need for any locking or explicit resource management. I like that a lot.
  • By default, errors that occur in a worker thread cause the debugger to be entered. This can be changed by setting lparallel:*debug-tasks-p* to nil. Errors are then caught and signalled to the caller of lparallel:force, which in my case was the right thing to do (i.e. if the background job triggers an error, I want the backtrace to be sent to the client).
  • One thing I find slightly bothersome is the reverse order of the cond clauses in the request handler: during the lifetime of a typical session, the last clause of the cond is true first, then the second to last, and so on. This is a pattern that I see quite often when writing HTTP request handlers. Maybe a cond-reverse macro would be suitable in this case (a sketch of what I mean follows below).
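
To make the first two points concrete, the kernel setup in my application boils down to something like the following sketch (initialize-background-jobs is just an illustrative name, not part of lparallel):

(defun initialize-background-jobs ()
  ;; One worker thread means that at most one long-running job executes
  ;; at any time, so requests are serialized without any explicit locking.
  (setf lparallel:*kernel* (lparallel:make-kernel 1 :name "report-worker"))
  ;; Transfer errors from tasks to the caller of LPARALLEL:FORCE instead
  ;; of entering the debugger in the worker thread.
  (setf lparallel:*debug-tasks-p* nil))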
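
As for the cond-reverse idea, a sketch could be as trivial as this; it is purely hypothetical, and I have not actually written or used it:

(defmacro cond-reverse (&rest clauses)
  ;; Like COND, but the clause list is reversed at macro-expansion time,
  ;; so clauses can be written in the order in which they become relevant
  ;; during a session (parameter form first, finished job last), while
  ;; COND still tests the most recent state first.
  `(cond ,@(reverse clauses)))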

lparallel has more to offer of course, and I have only used a very small part of it. So far, I have found it to be very well thought out, documented appropriately and well maintained. If you need parallel programming or even just easy background execution in your Common Lisp programs, I recommend looking at it.

2 comments:

  1. Great read! Got me interested to learn the library.

  2. Hello and thanks for the recommendation. You've given a nice example with a clear explanation to accompany it.

    I would add that *debug-tasks-p* was originally intended for use during development, as it's a global flag with global consequences. You can localize the decision to transfer errors by wrapping the creation of the future with

    (task-handler-bind ((error #'invoke-transfer-error)) ...)

    There are no hard rules, of course, and in some cases using *debug-tasks-p* may be best/easiest. It is probably best for an introduction like this one, in any case.

    Incidentally, if it weren't for the time-out issue, your code would be expected to work after the lparallel:future call is removed, without further changes. A promise is a generalization of a Lisp value (non-promises behave like fulfilled promises), and in a way you've simply made your code more general by introducing a promise.
