"Spin Buffers": DO NOT USE

Recently, my attention was directed to an article in June’s issue of Doctor Dobb’s Journal; it describes an implementation of thread-safe FIFO queues, “spin buffers”, which supposedly yields better throughput than standard alternatives like java.util.concurrent.ConcurrentLinkedQueue. However, I didn’t get far into the article before I read some text which raised a big red flag:

One approach [to addressing synchronization overhead] is to minimize synchronization as much as possible. But no matter how small you make the synchronization functionality, the problem remains and is amplified when the data/resource exchange happens a thousand times a second. Consequently, you should consider using Spin buffers if you are writing high-performance applications because they eliminate the need for synchronization. They don’t even need to employ low-level atomic instructions (such as Compare & Swap) found in advanced processors.

As described, this is a profound “free lunch violation” in the same class as perpetual motion machines, free energy devices, and the water-powered car: you need some kind of synchronization (memory barriers and atomic operations, if not also locks) when communicating between threads. Without it, the compiler and CPU are free to make all sorts of alterations to your code and its memory effects which can cause failures in multithreaded situations.

Synchronization isn’t just about keeping two threads from touching the same data at once; it’s also the only guarantee that your code will actually work the way you wrote it outside the confines of a single thread.

Looking through the source code (available from the issue’s downloadable source code archive as spin.txt and spin.zip) confirms my worst suspicions: there are no memory barriers involved whatsoever. Sadly, it’s quite easy to fool yourself into believing that a piece of code works in a concurrent setting, because concurrency bugs are notoriously difficult to test for. Even worse, the author tested on a single extremely forgiving hardware configuration (a single hyperthreaded x86 CPU), and his test program checked nothing besides counting the number of objects read from the queue.

So, please, don’t use this code, especially not in anything important. It’s fast because it throws correctness to the winds. Stick with code like ConcurrentLinkedQueue which has been extensively peer-reviewed.

Update: A number of readers have written me, repeating the author’s assertion that there aren’t any problems with a single reader and a single writer, because synchronization is only necessary to prevent conflicting updates. That simply isn’t true on modern systems: unless you employ proper synchronization, there are subtle problems which can arise when two threads simply communicate. The Java Memory Model FAQ gives a good overview of the issues involved.

Update: Bill Thies points out that, while you do need memory barriers for proper synchronization (which the article’s author failed to use), neither atomic operations nor locks are necessary for writing a single-reader/single-writer queue if you do it right.

hoodwink.d enhanced