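Introduction

As Herb Sutter put it, "The free lunch is over." The era in which programmers could simply wait for faster processors to speed up their programs has ended. With the rise of multi-core machines, software only gets faster when programmers design it to run several tasks in parallel. That sounds like a good way to improve performance, but it is not today's focus. A correct parallel design can speed a program up considerably, yet it also raises many problems: exchanging data between threads, returning results from threads, thread synchronization, the cost of creating threads, the cost of context switches, atomic operations, deciding when multithreading is worth it at all, and more. None of those is today's topic either. Today's protagonist is deadlock.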

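What is a deadlock

A simple, vivid example explains it. Suppose you have two adorable little brothers playing happily on the beach, and one toy made of two parts: a shovel and a bucket. One brother holds the bucket and the other holds the shovel. They are not only adorable but also stubborn. The first says, "I won't give you the bucket unless you give me the shovel," and the second says, "I won't give you the shovel unless you give me the bucket." Unless one of them gives up his part voluntarily, neither gets to play the sand-digging game. Now replace the children with threads and the two parts of the toy with locks. No thread will release its lock voluntarily, and that is a deadlock.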

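How deadlocks arise

The deadlock we meet most often is N mutexes (N >= 1) waiting on one another. Mutexes are not the only cause, though; other careless operations can deadlock too. For example, two threads that join each other will also deadlock. Strictly speaking, deadlock is a general phenomenon and is not unique to multithreading. It can also appear in networking; a familiar example is the deadlock-like stall that can occur with Nagle's algorithm.
In other words, a deadlock can occur in any synchronization construct that allows a cycle of waiting.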

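Avoiding deadlocks

Acquiring locks in a fixed order

This is the most effective way to avoid deadlock: every thread acquires the locks in the same order. It is easy to see why. Suppose we have three locks named A, B and C, and we guarantee by convention that A is locked before B and B before C, with unlocking in the reverse order. Then no cycle of waiting can form and no deadlock can occur. The ideal is attractive, but rules get broken. If the locking order does go wrong somewhere, it is hard to track down once the project has grown, and that is a real nuisance.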

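Hierarchical locks

A hierarchical lock is simply a way of enforcing a fixed locking order. We will briefly discuss the principle and then write a C++ implementation.

How hierarchical locks work

The principle is simple. As mentioned above, a hierarchical lock enforces a fixed order of acquisition. The basic idea is to divide the program into layers and define which mutexes each layer may lock. In practice we wrap the mutex and give each one a level value; while holding a lock, a thread may only acquire locks whose level is lower than that of the lock it holds (the opposite convention works just as well). Take the earlier example again: we have three locks A, B and C, with levels 5000, 3000 and 1000. A thread holding A may lock B, and a thread holding B may lock C, but not the other way round. A violation is reported at run time (the implementation below throws an exception) before the thread can block, and this is how deadlock is avoided.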

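C++ implementation

Understanding the thread_local specifier is essential here. It helps to picture the data changes during hierarchical locking as a linked list: each thread has its own chain, and every mutex it holds is a node that remembers the level that was current before it was locked (a personal view).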

#include <cstdint>
#include <limits>
#include <mutex>
#include <stdexcept>
#include <thread>

class hierarchical_mutex{
    private:
        std::mutex internal_mutex;
        uint64_t const hierarchical_value;  // level of this mutex
        uint64_t previous_value;            // thread's level before this mutex was locked
        // Level of the most recently locked mutex in the current thread.
        static thread_local uint64_t this_thread_hierarchical_value;

        // Throws if the current thread already holds a mutex of the same or a lower level.
        void check_for_hierarchy(){
            if(this_thread_hierarchical_value <= hierarchical_value){
                throw std::logic_error("mutex hierarchy violated.");
            }
        }

        // Remember the old level so that unlock() can restore it.
        void update_hierarchy_value(){
            previous_value = this_thread_hierarchical_value;
            this_thread_hierarchical_value = hierarchical_value;
        }

    public:
        explicit hierarchical_mutex(uint64_t value) :
            hierarchical_value(value), previous_value(0) {}

        void lock(){
            check_for_hierarchy();
            internal_mutex.lock();
            update_hierarchy_value();
        }

        void unlock(){
            this_thread_hierarchical_value = previous_value;
            internal_mutex.unlock();
        }

        bool try_lock(){
            check_for_hierarchy();
            if(!internal_mutex.try_lock()) return false;
            update_hierarchy_value();
            return true;
        }
};

// Every thread starts at the highest level, so it may lock any mutex first.
thread_local uint64_t
    hierarchical_mutex::this_thread_hierarchical_value =
        std::numeric_limits<uint64_t>::max();

hierarchical_mutex high_level_mutex(10000);
hierarchical_mutex low_level_mutex(5000);

void high_level_fun(){
    std::lock_guard<hierarchical_mutex> guard1(high_level_mutex);
    std::lock_guard<hierarchical_mutex> guard2(low_level_mutex);
    std::unique_lock<hierarchical_mutex> a;  // default-constructed: owns no mutex
    hierarchical_mutex other(6000);
    // Uncommenting the next line throws std::logic_error,
    // because 6000 is above the current level of 5000.
    //std::lock_guard<hierarchical_mutex> guard3(other);
}

int main(){
    auto T = std::thread(high_level_fun);
    T.join();
    return 0;
}

try_lock

When I first learned about threads, the try_lock function puzzled me a great deal; I could not see what it was for.

Here is how cppreference describes it:

Tries to lock the mutex. Returns immediately. On successful lock acquisition returns true, otherwise returns false.
This function is allowed to fail spuriously and return false even if the mutex is not currently locked by any other thread.
If try_lock is called by a thread that already owns the mutex, the behavior is undefined.
Prior unlock() operation on the same mutex synchronizes-with (as defined in std::memory_order) this operation if it returns true. Note that prior lock() does not synchronize with this operation if it returns false.

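That gives a rough idea of what the function does, but where is it useful? One use is deadlock avoidance.

How does that work? Imagine two locks, A and B. One thread holds A and wants B, while another thread holds B and now wants A. With an ordinary blocking lock this is certain to deadlock. With try_lock the attempt simply fails, the thread learns that it failed, and it can back off instead of waiting forever. Using try_lock everywhere is clearly unrealistic, however, because in most cases threads are meant to compete for a lock and wait their turn; if every acquisition were a try_lock, there would be no waiting to speak of. So C++ provides the following function.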

std::lock

Let's first look at the declaration of lock:

template<typename _L1, typename _L2, typename... _L3>
  void
  lock(_L1& __l1, _L2& __l2, _L3&... __l3);

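As we can see, it takes two ordinary parameters followed by a variadic parameter pack, so at least two lockable objects must be passed; presumably this design balances efficiency in the common case against tidy code. What does the function do? It has all-or-nothing semantics: if the try_lock on one of its arguments fails, it releases every lock it has already acquired, which gives the "nothing" half of the semantics and lets the other side take its locks, and then it tries again. This is how it avoids deadlock.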

demo

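#include <cstddef>
#include <iostream>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class X{
    private:
        std::mutex m;
        std::vector<int> vec;
    public:
        explicit X(const std::vector<int>& v) : vec(v){}
        friend void swap(X& lhs, X& rhs){
            if(&lhs == &rhs) return;  // locking the same mutex twice is undefined behavior
            std::lock(lhs.m, rhs.m);  // acquire both mutexes without risking deadlock
            // std::adopt_lock has one job: it tells lock_guard that the mutex is
            // already locked, so the constructor must not lock it again.
            std::lock_guard<std::mutex> lock_a(lhs.m, std::adopt_lock);
            std::lock_guard<std::mutex> lock_b(rhs.m, std::adopt_lock);
            std::swap(lhs.vec, rhs.vec);
        }

        std::size_t show() const{
            return vec.size();
        }
};

std::vector<int> on(5),tw(10);
X one(on),two(tw);

void test_swap(){
    swap(one, two);
}

int main(){
    auto T = std::thread(test_swap);
    swap(two, two);  // self-swap: returns early through the address check
    T.join();
    std::cout << "one : " << one.show() << std::endl;
    std::cout << "two : " << two.show() << std::endl;
    return 0;
}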

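This looks like a liberation for C++ programmers, and it would be nice if everything were that simple. The strength of this approach is also its limitation: it only helps when all the locks are acquired together at the same place, and otherwise it does little good. The code above uses lock_guard; for more flexibility we can of course use unique_lock instead.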

The ostrich algorithm

When deadlocks are rare enough, simply ignore them when they happen. (Interesting.)

Reference:
C++ Concurrency in Action