Still Better Than Mutexes

If you happened to watch my lightning talk from CppCon then you heard me talk about how there was a case where one could still get deadlock when using transactional memory. Specifically one could do something along the lines of1

a = 0
b = 0

//thread #1:
atomic
{
   if (a != 1)
      retry(); //sleep until a is changed by another thread
   b = 1;
}

//thread #2:
atomic
{
   if (b != 1)
      retry(); //sleep until b is changed by another thread
   a = 1;
}

ending up with a retry deadlock (neither thread will be able to make progress since each one will be waiting for the other to make a change). See section V in this paper2 for another example. It’s relatively easy to spot the issue in the above example, but in real code the issue will probably be buried in a bunch of other code.

That being said, I’ve never encountered this issue in practice. This doesn’t mean that it’s impossible to get into this state, just that it seems exceedingly rare to write code that could get you into this state. In my experience there is rarely a need to have interleaved retry conditions as above; most of the time if there are threads that retry on some condition there is some thread that will set the condition true unconditionally given enough time. This might involve a chain of retry conditions through various threads forming a graph of conditions, but I don’t think I’ve encountered a retry condition graph that wasn’t acyclic (you’d need a cycle in the graph to get into the above deadlock situation)3. So, at least for us, cycles have never arisen in our retry condition graph.

There’s also a subtle, yet big, difference between a retry deadlock and a mutex deadlock: another thread can break a retry deadlock. Consider the prototypical mutex deadlock:

//thread #1:
mutex1.lock()
mutex2.lock()

//thread #2:
mutex2.lock()
mutex1.lock()

If we get into deadlock here then any other thread that tries to interact with either of the mutexes involved will also get stuck. On the other hand consider the retry deadlock example above. Suppose we are in the deadlock and a third thread comes along and sets either variable to one. This will break the deadlock, something that is impossible in the mutex case. Really, the retry deadlock is more similar to a condition variable deadlock, e.g.:

//thread #1:
condition1.wait()
condition2.notify()

//thread #2:
condition2.wait()
condition1.notify()

In this case it is also possible for a third thread to come along and break the deadlock by notifying one of the condition variables.

I believe that our never having run into a retry deadlock has more to do with our getting lucky and not inadvertently introducing any cycles into the retry condition graph more than it does with there always being another thread around to break the deadlock (note that it probably isn’t hard to get lucky in this way though). But if you think about it, there’s a bit of a positive feedback loop at work here. You could have cycles in your retry condition graph but if the graph is connected enough in the vicinity of a cycle then there’s a good chance that there is a connection in the graph that will break the deadlock caused by that cycle given enough time.


  1. A quick recap: calling retry will cause your thread to go to sleep until one of the variables that you read from is changed in another thread. When one of these variables is changed your thread is woken up and the transaction starts over.

  2. In the paper’s parlance our retry is equivalent to their atomic(condition).

  3. It seems like there might be a way to statically analyze the program and trace this graph looking for cycles, but aliasing of transactional variables might prove an insurmountable obstacle.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: