Acquiring Multiple Locks Without Deadlock
Monday, 03 March 2008
In a software system with lots of fine-grained mutexes, it can sometimes be necessary to acquire locks on more than one mutex together in order to perform some operation. If this is not done with care, there is the possibility of deadlock, as multiple threads may lock the same mutexes in a different order. It is for this reason that the thread library coming with C++0x will include a lock() function for locking multiple mutexes together: this article describes the implementation details behind such a function.
Choose the lock order by role
The easiest way to deal with this is to always lock the mutexes in the same order. This is especially easy if the order can be hard-coded, and some uses naturally lend themselves towards this choice. For example, if the mutexes protect objects with different roles, it is relatively easy to always lock the mutex protecting one set of data before locking the other one. In such a situation, lock hierarchies can be used to enforce the ordering — with a lock hierarchy, a thread cannot acquire a lock on a mutex with a higher hierarchy level than any mutexes currently locked by that thread.
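For illustration, here is a minimal sketch of how such a hierarchy check might be enforced at runtime. It uses C++11 names (std::mutex, thread_local) rather than the Boost equivalents, and hierarchical_mutex is a hypothetical class, not part of any library:

```cpp
#include <climits>
#include <mutex>
#include <stdexcept>

// Hypothetical sketch: a mutex that records, per thread, the hierarchy
// level of the most recently acquired lock, and refuses to be locked if
// that would violate the ordering (i.e. if the thread already holds a
// lock at the same or a lower level).
class hierarchical_mutex
{
    std::mutex internal;
    unsigned const level;
    unsigned previous_level;
    static thread_local unsigned this_thread_level;
public:
    explicit hierarchical_mutex(unsigned level_):
        level(level_),previous_level(0)
    {}
    void lock()
    {
        if(this_thread_level<=level)
            throw std::logic_error("mutex hierarchy violated");
        internal.lock();
        previous_level=this_thread_level;
        this_thread_level=level;
    }
    void unlock()
    {
        this_thread_level=previous_level;
        internal.unlock();
    }
};
thread_local unsigned hierarchical_mutex::this_thread_level=UINT_MAX;
```

Locking a high-level mutex and then a lower-level one succeeds; trying it the other way round throws, which turns a potential deadlock into an immediately visible error.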
If it is not possible to decide a priori which mutex to lock first, such as when the mutexes are associated with the same sort of data, then a more complicated policy must be applied.
Choose the lock order by address
The simplest technique in these cases is to always lock the mutexes in ascending order of address (examples use the types and functions from the upcoming 1.35 release of Boost), like this:
    void lock(boost::mutex& m1,boost::mutex& m2)
    {
        if(&m1<&m2)
        {
            m1.lock();
            m2.lock();
        }
        else
        {
            m2.lock();
            m1.lock();
        }
    }
This works for small numbers of mutexes, provided this policy is maintained throughout the application, but if several mutexes must be locked together, then calculating the ordering can get complicated, and potentially inefficient. It also requires that the mutexes are all of the same type. Since there are many possible mutex and lock types that an application might choose to use, this is a notable disadvantage, as the function must be written afresh for each possible combination.
Order mutexes "naturally", with try-and-back-off
If the mutexes cannot be ordered by address (for whatever reason), then an alternative scheme must be found. One such scheme is to use a try-and-back-off algorithm: try and lock each mutex in turn; if any cannot be locked, unlock the others and start again. The simplest implementation for 3 mutexes looks like this:
    void lock(boost::mutex& m1,boost::mutex& m2,boost::mutex& m3)
    {
        do
        {
            m1.lock();
            if(m2.try_lock())
            {
                if(m3.try_lock())
                {
                    return;
                }
                m2.unlock();
            }
            m1.unlock();
        }
        while(true);
    }
Wait for the failed mutex
The big problem with this scheme is that it always locks the mutexes in the same order. If m1 and m2 are currently free, but m3 is locked by another thread, then this thread will repeatedly lock m1 and m2, fail to lock m3, and unlock m1 and m2. This just wastes CPU cycles for no gain. Instead, what we want to do is block waiting for m3, and try to acquire the others only when m3 has been successfully locked by this thread. For three mutexes, a first attempt looks like this:
    void lock(boost::mutex& m1,boost::mutex& m2,boost::mutex& m3)
    {
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                m1.lock();
                if(m2.try_lock())
                {
                    if(m3.try_lock())
                        return;
                    lock_first=2;
                    m2.unlock();
                }
                else
                {
                    lock_first=1;
                }
                m1.unlock();
                break;
            case 1:
                m2.lock();
                if(m3.try_lock())
                {
                    if(m1.try_lock())
                        return;
                    lock_first=0;
                    m3.unlock();
                }
                else
                {
                    lock_first=2;
                }
                m2.unlock();
                break;
            case 2:
                m3.lock();
                if(m1.try_lock())
                {
                    if(m2.try_lock())
                        return;
                    lock_first=1;
                    m1.unlock();
                }
                else
                {
                    lock_first=0;
                }
                m3.unlock();
                break;
            }
        }
    }
Simplicity and Robustness
This code is very long-winded, with all the duplication between the case blocks. Also, it assumes that the mutexes are all boost::mutex, which is overly restrictive. Finally, it assumes that the try_lock calls don't throw exceptions. Whilst this is true for the Boost mutexes, it is not required to be true in general, so a more robust implementation that allows the mutex type to be supplied as a template parameter will ensure that any exceptions thrown leave all the mutexes unlocked: the unique_lock template will help with that by providing RAII locking. Taking all this into account leaves us with the following:
    template<typename MutexType1,typename MutexType2,typename MutexType3>
    unsigned lock_helper(MutexType1& m1,MutexType2& m2,MutexType3& m3)
    {
        boost::unique_lock<MutexType1> l1(m1);
        boost::unique_lock<MutexType2> l2(m2,boost::try_to_lock);
        if(!l2)
        {
            return 1;
        }
        if(!m3.try_lock())
        {
            return 2;
        }
        l2.release();
        l1.release();
        return 0;
    }

    template<typename MutexType1,typename MutexType2,typename MutexType3>
    void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3)
    {
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                lock_first=lock_helper(m1,m2,m3);
                if(!lock_first)
                    return;
                break;
            case 1:
                lock_first=lock_helper(m2,m3,m1);
                if(!lock_first)
                    return;
                lock_first=(lock_first+1)%3;
                break;
            case 2:
                lock_first=lock_helper(m3,m1,m2);
                if(!lock_first)
                    return;
                lock_first=(lock_first+2)%3;
                break;
            }
        }
    }
This code is simultaneously shorter, simpler, and more general than the previous implementation, and is robust in the face of exceptions. The lock_helper function locks the first mutex, and then tries to lock the other two in turn. If either of the try_locks fails, then all currently-locked mutexes are unlocked, and it returns the index of the mutex that couldn't be locked. On success, the release members of the unique_lock instances are called to relinquish ownership of the locks, so that they don't automatically unlock the mutexes during destruction, and 0 is returned. The outer lock function is just a simple wrapper around lock_helper that chooses the order of the mutexes so that the one that failed to lock last time is tried first.
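For what it's worth, the lock() function described here was eventually standardised as std::lock in C++11, which is specified to use a deadlock-avoidance algorithm. A minimal usage sketch, with two threads deliberately passing the mutexes in different orders:

```cpp
#include <mutex>
#include <thread>

std::mutex m1,m2,m3;
int counter=0;

// Each thread names the mutexes in a different order; std::lock avoids
// deadlock regardless, so the increments are always fully serialised.
void worker(std::mutex& a,std::mutex& b,std::mutex& c)
{
    for(int i=0;i<10000;++i)
    {
        std::lock(a,b,c);
        std::lock_guard<std::mutex> g1(a,std::adopt_lock);
        std::lock_guard<std::mutex> g2(b,std::adopt_lock);
        std::lock_guard<std::mutex> g3(c,std::adopt_lock);
        ++counter;
    }
}
```

The std::adopt_lock guards take over the already-locked mutexes so they are released correctly even if the loop body throws.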
Extending to more mutexes
This scheme can also be easily extended to handle more mutexes, though the code gets unavoidably longer, since there are more cases to handle — this is where the C++0x variadic templates will really come into their own. Here's the code for locking 5 mutexes together:
    template<typename MutexType1,typename MutexType2,typename MutexType3,
             typename MutexType4,typename MutexType5>
    unsigned lock_helper(MutexType1& m1,MutexType2& m2,MutexType3& m3,
                         MutexType4& m4,MutexType5& m5)
    {
        boost::unique_lock<MutexType1> l1(m1);
        boost::unique_lock<MutexType2> l2(m2,boost::try_to_lock);
        if(!l2)
        {
            return 1;
        }
        boost::unique_lock<MutexType3> l3(m3,boost::try_to_lock);
        if(!l3)
        {
            return 2;
        }
        boost::unique_lock<MutexType4> l4(m4,boost::try_to_lock);
        if(!l4)
        {
            return 3;
        }
        if(!m5.try_lock())
        {
            return 4;
        }
        l4.release();
        l3.release();
        l2.release();
        l1.release();
        return 0;
    }

    template<typename MutexType1,typename MutexType2,typename MutexType3,
             typename MutexType4,typename MutexType5>
    void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3,
              MutexType4& m4,MutexType5& m5)
    {
        unsigned const lock_count=5;
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                lock_first=lock_helper(m1,m2,m3,m4,m5);
                if(!lock_first)
                    return;
                break;
            case 1:
                lock_first=lock_helper(m2,m3,m4,m5,m1);
                if(!lock_first)
                    return;
                lock_first=(lock_first+1)%lock_count;
                break;
            case 2:
                lock_first=lock_helper(m3,m4,m5,m1,m2);
                if(!lock_first)
                    return;
                lock_first=(lock_first+2)%lock_count;
                break;
            case 3:
                lock_first=lock_helper(m4,m5,m1,m2,m3);
                if(!lock_first)
                    return;
                lock_first=(lock_first+3)%lock_count;
                break;
            case 4:
                lock_first=lock_helper(m5,m1,m2,m3,m4);
                if(!lock_first)
                    return;
                lock_first=(lock_first+4)%lock_count;
                break;
            }
        }
    }
Final Code
The final code for acquiring multiple locks provides try_lock and lock functions for 2 to 5 mutexes. Though the try_lock functions are relatively straightforward, their existence makes the lock_helper functions slightly simpler, as they can just defer to the appropriate overload of try_lock to cover all the mutexes beyond the first one.
Posted by Anthony Williams
Tags: threading, concurrency, mutexes, locks
Design and Content Copyright © 2005-2024 Just Software Solutions Ltd. All rights reserved.
4 Comments
Hi,
my concern about acquiring multiple locks in this way is the starvation problem and the performance hit. Say I have 2 locks and 2 threads using lock(boost::mutex& m1,boost::mutex& m2). And let's say they are passed into the lock function in a different order. The first one gets m1, while the other one gets m2. Then they both try to lock the other mutex, fail to do so and release both mutexes. Then the first one locks m2, the second one locks m1 and the circle repeats again... what's preventing them from forever alternating trying to acquire the locks? In fact, they don't even necessarily have to alternate forever in order for this algorithm to be slower than just defining the order of mutexes.
Or, take a different situation... Let's say you have 5 mutexes and n threads, of which n-1 threads are using the predefined order of acquiring mutexes (i.e. acquiring them one by one), and 1 thread is using lock(MutexType1& m1,MutexType2& m2,MutexType3& m3, MutexType4& m4,MutexType5& m5). The probability of that one thread getting all 5 locks, and its chance to run, seem to be much lower compared to the other threads... Yes, it will never deadlock, but it will be treated unfairly compared to the other threads.
I guess my question is: what are the benefits of using the lock function with multiple mutexes? It will not deadlock, but is there anything else I'm missing? To me it just seems to be less effective than defining a hierarchy of mutexes, so the only time I would want to use it is if I'm unaware of the hierarchy for whatever reason. And even then, I would try hard to find out before accepting the use of this type of lock. :)
Thanks!
There is nothing preventing two threads using lock() from live-locking as you describe. However, this is actually unlikely: the chances are that the vagaries of the scheduler will allow one thread to acquire all the locks before the other gets in there.
If you are in a position where you *can* define the lock order, it is always better to do so. lock() just enables you to safely acquire multiple locks in circumstances where you cannot define the order.
Hi,
I'm trying to understand the source code of boost/thread/lock_algorithms and boost/thread/barrier. I have one question.
In barrier::wait, why use a while loop like the following?

    while (gen == m_generation)
        m_cond.wait(lock);

It seems m_cond would not be notified until m_generation has changed; can it not just call m_cond.wait(lock) once, without the while loop?
Thanks!
Zudeng: Whenever you use a condition variable wait without an explicit predicate, you need to wrap it in a loop to handle spurious wakes. See https://www.justsoftwaresolutions.co.uk/threading/condition-variable-spurious-wakes.html