Ruby STM: Round Two

Progress so Far

Since Ruby STM: First Round, I’ve put in a good weekend of hacking. By Monday morning I already had a working STM.retry, and once I got home from work I spent most of last night getting STM#or_else operational. Most of that work involved cleaning up the metaprogramming mess I’d made and adding write checkpointing to the transaction class. Contra Jones et al., fully nested transactions don’t seem to be necessary to implement orElse. At this point I believe the sample script from round one should work, but I haven’t tried it yet.

What’s Left

So, here’s what’s left to do before I consider my Ruby STM prototype to be feature-complete:
  • STM::Variable – a simple transactional object holding a single value (probably accessed through a value accessor and maybe a zero-argument variant of [] as some sort of dereference-like sugar)
  • blocking STM.retry – currently, retry polls continuously rather than blocking
  • optimistic concurrency control for reads – this means, per Ennals, no longer acquiring locks when reading, then adding object versioning and some sort of reaper mechanism to clean up the resulting invalid transactions. It’s, uh, better than I made it sound.
  • more transactional container types – besides STM::Struct, I’d also like to have transactional versions of at least Array, Hash, Set, and possibly String
  • a nice API for timeouts – an implementation-aware timeout facility seems to be necessary because transactions can block one another (a timeout needs to be able to interrupt a lock attempt), but even if the naive hand-rolled version worked well it’s not the sort of thing that people should have to keep recreating:
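require 'stm'
require 'timeout'  # provides Timeout::Error

def timeout( duration, &block )
  timed_out = STM::Variable.new
  # watchdog thread: set the flag once the duration has elapsed
  Thread.new { sleep duration ; timed_out[] = true }
  STM.new( &block ).or_else {
    # reached only when the main block retries; keep retrying
    # until the watchdog has set the flag
    STM.retry unless timed_out[]
    raise Timeout::Error
  }.call
end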

timeout( 3 ) do
  # some (sub-)transaction which should time out
  # if it takes longer than 3 seconds
end
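
As for the STM::Variable item in the list above, the accessor sugar might look something like the sketch below. SketchVariable is a made-up name, and the class is deliberately not transactional (no locking, logging, or versioning); it only shows how a value accessor and a zero-argument [] can coexist in plain Ruby.

```ruby
# Hypothetical stand-in for STM::Variable. It demonstrates only the
# accessor sugar and is NOT transactional.
class SketchVariable
  # Plain accessor: var.value and var.value = x
  attr_accessor :value

  def initialize(value = nil)
    @value = value
  end

  # Zero-argument [] as dereference-like sugar: var[]
  def []
    @value
  end

  # Matching assignment form: var[] = new_value
  def []=(new_value)
    @value = new_value
  end
end

timed_out = SketchVariable.new(false)
timed_out[] = true   # same effect as timed_out.value = true
```

Ruby parses var[] = x as a call to []= with a single argument, so the timed_out[] = true line in the timeout example needs no special support from the interpreter.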

The retry until Idiom

It seems to me that it could be clearer to write STM.retry until condition instead of STM.retry unless condition; since STM.retry unwinds the stack when it’s called, the two are essentially equivalent. until might involve a little extra overhead, but that overhead would most likely be swamped by the transaction machinery.
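
To illustrate the equivalence, here is a small sketch in which a stand-in retry unwinds the stack with throw. SketchSTM and its attempts helper are hypothetical names invented for this example, and how the real STM.retry unwinds is not shown here; the only assumption is that it never returns to its caller.

```ruby
# Hypothetical stand-in for the STM machinery: retry unwinds the stack
# back to the enclosing attempt, so it never returns to its caller.
module SketchSTM
  def self.retry
    throw :sketch_retry
  end

  # Re-runs the block until it completes without calling retry and
  # returns the number of attempts that took.
  def self.attempts
    count = 0
    loop do
      count += 1
      finished = false
      catch(:sketch_retry) do
        yield
        finished = true
      end
      return count if finished
    end
  end
end

polls = 0
unless_attempts = SketchSTM.attempts do
  polls += 1
  SketchSTM.retry unless polls >= 3
end

polls = 0
until_attempts = SketchSTM.attempts do
  polls += 1
  SketchSTM.retry until polls >= 3
end
```

Because the first call to retry leaves the block, the until loop body never runs a second time within one attempt, and both forms need the same number of attempts here.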

try ... or_else

The current arrangement for using or_else involves explicitly creating an STM object and then another for the or_else, ultimately invoking the blocks indirectly through STM#call. It might be nicer to offer an STM.try {...} which returns an object from one of two classes—say Retried and Succeeded. The former would be a singleton with an or_else {...} that yielded to its own block and returned the result. The latter would have an or_else {...} that ignored its own block and simply returned the (stored) result of the first block.

To wit:

STM.try {
  ...
}.or_else {
  ...
}

...would be a more efficient equivalent to:

STM.new {
  ...
}.or_else {
  ...
}.call

...just as:

STM.atomically {
  ...
}

...is already a more efficient equivalent to:

STM.new {
  ...
}.call
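
The two-class arrangement described above could be sketched roughly as follows. TrySketch and the throw-based retry are assumptions made for illustration only. The sketch covers just the control flow: it does nothing about rolling back the first block's writes, and a retry inside the or_else block is not handled.

```ruby
module TrySketch
  # Hypothetical stand-in for STM.retry: abandons the first block and
  # hands the shared Retried instance back to TrySketch.try.
  def self.retry
    throw :try_sketch_retry, RETRIED
  end

  # Wraps the result of a first block that ran to completion.
  class Succeeded
    def initialize(result)
      @result = result
    end

    # Ignores its own block and returns the stored result.
    def or_else
      @result
    end
  end

  # Stands for a first block that called retry.
  class Retried
    # Yields to its own block and returns that block's result.
    def or_else
      yield
    end
  end

  # One shared instance, playing the role of the singleton.
  RETRIED = Retried.new

  # Returns a Succeeded wrapping the block's value, or RETRIED if the
  # block called TrySketch.retry.
  def self.try
    catch(:try_sketch_retry) { Succeeded.new(yield) }
  end
end

first  = TrySketch.try { 1 + 1 }.or_else { :unused }
second = TrySketch.try { TrySketch.retry }.or_else { :fallback }
```

In this sketch first ends up holding the first block's result and second holds the fallback's, with no intermediate STM object or explicit call step in either case.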

Future Direction

At the moment, I’m writing only for clarity and correctness rather than speed, but once the Ruby prototype is complete and well-tested, I plan on moving most of the machinery into a C extension. My hope is that STM (in addition to being easier to use) will become a more efficient option for managing concurrency in Ruby than the usual Mutex or Thread.critical hacks.
