Acquiring Multiple Locks Without Deadlock
Monday, 03 March 2008
In a software system with lots of fine-grained mutexes, it can sometimes be necessary to acquire locks on more than one mutex together in order to perform some operation. If this is not done with care, there is the possibility of deadlock, as multiple threads may lock the same mutexes in a different order. It is for this reason that the thread library coming with C++0x will include a lock() function for locking multiple mutexes together: this article describes the implementation details behind such a function.
Choose the lock order by role
The easiest way to deal with this is to always lock the mutexes in the same order. This is especially easy if the order can be hard-coded, and some uses naturally lend themselves towards this choice. For example, if the mutexes protect objects with different roles, it is relatively easy to always lock the mutex protecting one set of data before locking the other one. In such a situation, lock hierarchies can be used to enforce the ordering — with a lock hierarchy, a thread cannot acquire a lock on a mutex with a higher hierarchy level than any mutexes currently locked by that thread.
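For illustration, here is a minimal sketch of how such a hierarchy check might be enforced at runtime. It uses C++11 names (std::mutex, thread_local) rather than the Boost equivalents, and hierarchical_mutex is a hypothetical class, not part of any library:

```cpp
#include <climits>
#include <mutex>
#include <stdexcept>

// Hypothetical sketch: a mutex that records, per thread, the hierarchy
// level of the most recently acquired lock, and refuses to be locked if
// that would violate the ordering (i.e. if the thread already holds a
// lock at the same or a lower level).
class hierarchical_mutex
{
    std::mutex internal;
    unsigned const level;
    unsigned previous_level;
    static thread_local unsigned this_thread_level;
public:
    explicit hierarchical_mutex(unsigned level_):
        level(level_),previous_level(0)
    {}
    void lock()
    {
        if(this_thread_level<=level)
            throw std::logic_error("mutex hierarchy violated");
        internal.lock();
        previous_level=this_thread_level;
        this_thread_level=level;
    }
    void unlock()
    {
        this_thread_level=previous_level;
        internal.unlock();
    }
};
thread_local unsigned hierarchical_mutex::this_thread_level=UINT_MAX;
```

Locking a high-level mutex and then a lower-level one succeeds; trying it the other way round throws, which turns a potential deadlock into an immediately visible error.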
If it is not possible to decide a priori which mutex to lock first, such as when the mutexes are associated with the same sort of data, then a more complicated policy must be applied.
Choose the lock order by address
The simplest technique in these cases is to always lock the mutexes in ascending order of address (examples use the types and functions from the upcoming 1.35 release of Boost), like this:
    void lock(boost::mutex& m1,boost::mutex& m2)
    {
        if(&m1<&m2)
        {
            m1.lock();
            m2.lock();
        }
        else
        {
            m2.lock();
            m1.lock();
        }
    }
This works for small numbers of mutexes, provided this policy is maintained throughout the application, but if several mutexes must be locked together, then calculating the ordering can get complicated, and potentially inefficient. It also requires that the mutexes are all of the same type. Since there are many possible mutex and lock types that an application might choose to use, this is a notable disadvantage, as the function must be written afresh for each possible combination.
Order mutexes "naturally", with try-and-back-off
If the mutexes cannot be ordered by address (for whatever reason), then an alternative scheme must be found. One such scheme is to use a try-and-back-off algorithm: try and lock each mutex in turn; if any cannot be locked, unlock the others and start again. The simplest implementation for 3 mutexes looks like this:
    void lock(boost::mutex& m1,boost::mutex& m2,boost::mutex& m3)
    {
        do
        {
            m1.lock();
            if(m2.try_lock())
            {
                if(m3.try_lock())
                {
                    return;
                }
                m2.unlock();
            }
            m1.unlock();
        }
        while(true);
    }
Wait for the failed mutex
The big problem with this scheme is that it always locks the mutexes in the same order. If m1 and m2 are currently free, but m3 is locked by another thread, then this thread will repeatedly lock m1 and m2, fail to lock m3, and unlock m1 and m2. This just wastes CPU cycles for no gain. Instead, what we want to do is block waiting for m3, and try to acquire the others only when m3 has been successfully locked by this thread. For three mutexes, a first attempt looks like this:
    void lock(boost::mutex& m1,boost::mutex& m2,boost::mutex& m3)
    {
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                m1.lock();
                if(m2.try_lock())
                {
                    if(m3.try_lock())
                        return;
                    lock_first=2;
                    m2.unlock();
                }
                else
                {
                    lock_first=1;
                }
                m1.unlock();
                break;
            case 1:
                m2.lock();
                if(m3.try_lock())
                {
                    if(m1.try_lock())
                        return;
                    lock_first=0;
                    m3.unlock();
                }
                else
                {
                    lock_first=2;
                }
                m2.unlock();
                break;
            case 2:
                m3.lock();
                if(m1.try_lock())
                {
                    if(m2.try_lock())
                        return;
                    lock_first=1;
                    m1.unlock();
                }
                else
                {
                    lock_first=0;
                }
                m3.unlock();
                break;
            }
        }
    }
Simplicity and Robustness
This code is very long-winded, with all the duplication between the case blocks. Also, it assumes that the mutexes are all boost::mutex, which is overly restrictive. Finally, it assumes that the try_lock calls don't throw exceptions. Whilst this is true for the Boost mutexes, it is not required to be true in general, so a more robust implementation that allows the mutex type to be supplied as a template parameter will ensure that any exceptions thrown leave all the mutexes unlocked: the unique_lock template will help with that by providing RAII locking. Taking all this into account leaves us with the following:
    template<typename MutexType1,typename MutexType2,typename MutexType3>
    unsigned lock_helper(MutexType1& m1,MutexType2& m2,MutexType3& m3)
    {
        boost::unique_lock<MutexType1> l1(m1);
        boost::unique_lock<MutexType2> l2(m2,boost::try_to_lock);
        if(!l2)
        {
            return 1;
        }
        if(!m3.try_lock())
        {
            return 2;
        }
        l2.release();
        l1.release();
        return 0;
    }

    template<typename MutexType1,typename MutexType2,typename MutexType3>
    void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3)
    {
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                lock_first=lock_helper(m1,m2,m3);
                if(!lock_first)
                    return;
                break;
            case 1:
                lock_first=lock_helper(m2,m3,m1);
                if(!lock_first)
                    return;
                lock_first=(lock_first+1)%3;
                break;
            case 2:
                lock_first=lock_helper(m3,m1,m2);
                if(!lock_first)
                    return;
                lock_first=(lock_first+2)%3;
                break;
            }
        }
    }
This code is simultaneously shorter, simpler, and more general than the previous implementation, and is robust in the face of exceptions. The lock_helper function locks the first mutex, and then tries to lock the other two in turn. If either of the try_locks fails, then all currently-locked mutexes are unlocked, and it returns the index of the mutex that couldn't be locked. On success, the release members of the unique_lock instances are called to relinquish ownership of the locks, so that they don't automatically unlock the mutexes during destruction, and 0 is returned. The outer lock function is just a simple wrapper around lock_helper that chooses the order of the mutexes so that the one that failed to lock last time is tried first.
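For what it's worth, the lock() function described here was eventually standardised as std::lock in C++11, which is specified to use a deadlock-avoidance algorithm. A minimal usage sketch, with two threads deliberately passing the mutexes in different orders:

```cpp
#include <mutex>
#include <thread>

std::mutex m1,m2,m3;
int counter=0;

// Each thread names the mutexes in a different order; std::lock avoids
// deadlock regardless, so the increments are always fully serialised.
void worker(std::mutex& a,std::mutex& b,std::mutex& c)
{
    for(int i=0;i<10000;++i)
    {
        std::lock(a,b,c);
        std::lock_guard<std::mutex> g1(a,std::adopt_lock);
        std::lock_guard<std::mutex> g2(b,std::adopt_lock);
        std::lock_guard<std::mutex> g3(c,std::adopt_lock);
        ++counter;
    }
}
```

The std::adopt_lock guards take over the already-locked mutexes so they are released correctly even if the loop body throws.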
Extending to more mutexes
This scheme can also be easily extended to handle more mutexes, though the code gets unavoidably longer, since there are more cases to handle — this is where the C++0x variadic templates will really come into their own. Here's the code for locking 5 mutexes together:
    template<typename MutexType1,typename MutexType2,typename MutexType3,
             typename MutexType4,typename MutexType5>
    unsigned lock_helper(MutexType1& m1,MutexType2& m2,MutexType3& m3,
                         MutexType4& m4,MutexType5& m5)
    {
        boost::unique_lock<MutexType1> l1(m1);
        boost::unique_lock<MutexType2> l2(m2,boost::try_to_lock);
        if(!l2)
        {
            return 1;
        }
        boost::unique_lock<MutexType3> l3(m3,boost::try_to_lock);
        if(!l3)
        {
            return 2;
        }
        boost::unique_lock<MutexType4> l4(m4,boost::try_to_lock);
        if(!l4)
        {
            return 3;
        }
        if(!m5.try_lock())
        {
            return 4;
        }
        l4.release();
        l3.release();
        l2.release();
        l1.release();
        return 0;
    }

    template<typename MutexType1,typename MutexType2,typename MutexType3,
             typename MutexType4,typename MutexType5>
    void lock(MutexType1& m1,MutexType2& m2,MutexType3& m3,
              MutexType4& m4,MutexType5& m5)
    {
        unsigned const lock_count=5;
        unsigned lock_first=0;
        while(true)
        {
            switch(lock_first)
            {
            case 0:
                lock_first=lock_helper(m1,m2,m3,m4,m5);
                if(!lock_first)
                    return;
                break;
            case 1:
                lock_first=lock_helper(m2,m3,m4,m5,m1);
                if(!lock_first)
                    return;
                lock_first=(lock_first+1)%lock_count;
                break;
            case 2:
                lock_first=lock_helper(m3,m4,m5,m1,m2);
                if(!lock_first)
                    return;
                lock_first=(lock_first+2)%lock_count;
                break;
            case 3:
                lock_first=lock_helper(m4,m5,m1,m2,m3);
                if(!lock_first)
                    return;
                lock_first=(lock_first+3)%lock_count;
                break;
            case 4:
                lock_first=lock_helper(m5,m1,m2,m3,m4);
                if(!lock_first)
                    return;
                lock_first=(lock_first+4)%lock_count;
                break;
            }
        }
    }
Final Code
The final code for acquiring multiple locks provides try_lock and lock functions for 2 to 5 mutexes. Though the try_lock functions are relatively straightforward, their existence makes the lock_helper functions slightly simpler, as they can just defer to the appropriate overload of try_lock to cover all the mutexes beyond the first one.
Posted by Anthony Williams
Tags: threading, concurrency, mutexes, locks
Design and Content Copyright © 2005-2024 Just Software Solutions Ltd. All rights reserved.
4 Comments
Hi,
my concern about acquiring multiple locks in this way is the starvation problem and the performance hit. Say I have 2 locks and 2 threads using lock(boost::mutex& m1,boost::mutex& m2). And let's say they are passed into the lock function in a different order. The first one gets m1, while the other one gets m2. Then they both try to lock the other mutex, fail to do so and release both mutexes. Then the first one locks m2, the second one locks m1 and the circle repeats again... what's preventing them from forever alternating trying to acquire the locks? In fact, they don't even necessarily have to alternate forever in order for this algorithm to be slower than just defining the order of mutexes.
Or, take a different situation... Let's say you have 5 mutexes and n threads, of which n-1 threads are using the predefined order of acquiring mutexes (i.e. acquiring them one by one), and 1 thread is using lock(MutexType1& m1,MutexType2& m2,MutexType3& m3, MutexType4& m4,MutexType5& m5). The probability of that one thread getting all 5 locks, and its chance to run, seem to be much lower compared to the other threads... Yes, it will never deadlock, but it will be treated unfairly compared to the other threads.
I guess my question is: what are the benefits of using the lock function with multiple mutexes? It will not deadlock, but is there anything else I'm missing? To me it just seems to be less effective than defining a hierarchy of mutexes, so the only time I would want to use it is if I'm unaware of the hierarchy for whatever reason. And even then, I would try hard to find out before accepting the use of this type of lock. :)
Thanks!
There is nothing preventing two threads using lock() from live-locking as you describe. However, this is actually unlikely: the chances are that the vagaries of the scheduler will allow one thread to acquire all the locks before the other gets in there.
If you are in a position where you *can* define the lock order, it is always better to do so. lock() just enables you to safely acquire multiple locks in circumstances where you cannot define the order.
Hi,
I'm trying to understand the source code of boost/thread/lock_algorithms and boost/thread/barrier. I have one question.
In barrier::wait, why use a while loop like the following?

    while (gen == m_generation)
        m_cond.wait(lock);

It seems m_cond would not be notified until m_generation has changed; can it not just call m_cond.wait(lock) once, without the while loop?
Thanks!
Zudeng: Whenever you use a condition variable wait without an explicit predicate, you need to wrap it in a loop to handle spurious wakes. See https://www.justsoftwaresolutions.co.uk/threading/condition-variable-spurious-wakes.html