    mm, slab: release slab_mutex earlier in kmem_cache_destroy() · 210ed9de
    Jiri Kosina authored
    Commit 1331e7a1 ("rcu: Remove _rcu_barrier() dependency on
    __stop_machine()") introduced slab_mutex -> cpu_hotplug.lock dependency
    through kmem_cache_destroy() -> rcu_barrier() -> _rcu_barrier() ->
    get_online_cpus().
    
    Lockdep thinks that this might actually result in an ABBA deadlock,
    and reports it as below:
    
    === [ cut here ] ===
     ======================================================
     [ INFO: possible circular locking dependency detected ]
     3.6.0-rc5-00004-g0d8ee37 #143 Not tainted
     -------------------------------------------------------
     kworker/u:2/40 is trying to acquire lock:
      (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0
    
     but task is already holding lock:
      (slab_mutex){+.+.+.}, at: [<ffffffff81176e15>] kmem_cache_destroy+0x45/0xe0
    
     which lock already depends on the new lock.
    
     the existing dependency chain (in reverse order) is:
    
     -> #2 (slab_mutex){+.+.+.}:
            [<ffffffff810ae1e2>] validate_chain+0x632/0x720
            [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
            [<ffffffff810ae921>] lock_acquire+0x121/0x190
            [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
            [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
            [<ffffffff81558cb5>] cpuup_callback+0x2f/0xbe
            [<ffffffff81564b83>] notifier_call_chain+0x93/0x140
            [<ffffffff81076f89>] __raw_notifier_call_chain+0x9/0x10
            [<ffffffff8155719d>] _cpu_up+0xba/0x14e
            [<ffffffff815572ed>] cpu_up+0xbc/0x117
            [<ffffffff81ae05e3>] smp_init+0x6b/0x9f
            [<ffffffff81ac47d6>] kernel_init+0x147/0x1dc
            [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10
    
     -> #1 (cpu_hotplug.lock){+.+.+.}:
            [<ffffffff810ae1e2>] validate_chain+0x632/0x720
            [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
            [<ffffffff810ae921>] lock_acquire+0x121/0x190
            [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
            [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
            [<ffffffff81049197>] get_online_cpus+0x37/0x50
            [<ffffffff810f21bb>] _rcu_barrier+0xbb/0x1e0
            [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
            [<ffffffff810f2309>] rcu_barrier+0x9/0x10
            [<ffffffff8118c129>] deactivate_locked_super+0x49/0x90
            [<ffffffff8118cc01>] deactivate_super+0x61/0x70
            [<ffffffff811aaaa7>] mntput_no_expire+0x127/0x180
            [<ffffffff811ab49e>] sys_umount+0x6e/0xd0
            [<ffffffff81569979>] system_call_fastpath+0x16/0x1b
    
     -> #0 (rcu_sched_state.barrier_mutex){+.+...}:
            [<ffffffff810adb4e>] check_prev_add+0x3de/0x440
            [<ffffffff810ae1e2>] validate_chain+0x632/0x720
            [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
            [<ffffffff810ae921>] lock_acquire+0x121/0x190
            [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
            [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
            [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0
            [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
            [<ffffffff810f2309>] rcu_barrier+0x9/0x10
            [<ffffffff81176ea1>] kmem_cache_destroy+0xd1/0xe0
            [<ffffffffa04c3154>] nf_conntrack_cleanup_net+0xe4/0x110 [nf_conntrack]
            [<ffffffffa04c31aa>] nf_conntrack_cleanup+0x2a/0x70 [nf_conntrack]
            [<ffffffffa04c42ce>] nf_conntrack_net_exit+0x5e/0x80 [nf_conntrack]
            [<ffffffff81454b79>] ops_exit_list+0x39/0x60
            [<ffffffff814551ab>] cleanup_net+0xfb/0x1b0
            [<ffffffff8106917b>] process_one_work+0x26b/0x4c0
            [<ffffffff81069f3e>] worker_thread+0x12e/0x320
            [<ffffffff8106f73e>] kthread+0x9e/0xb0
            [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10
    
     other info that might help us debug this:
    
     Chain exists of:
       rcu_sched_state.barrier_mutex --> cpu_hotplug.lock --> slab_mutex
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(slab_mutex);
                                    lock(cpu_hotplug.lock);
                                    lock(slab_mutex);
       lock(rcu_sched_state.barrier_mutex);
    
      *** DEADLOCK ***
    === [ cut here ] ===
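
The check behind a report like this can be illustrated with a small
userspace sketch. It is not lockdep's implementation; it only assumes that
each "B acquired while A held" observation is stored as an edge A -> B and
that a new edge is refused when it would close a cycle. The enum values and
function names are hypothetical, chosen to mirror the three locks in the
report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the locks in the report. */
enum lock_class { BARRIER_MUTEX, CPU_HOTPLUG_LOCK, SLAB_MUTEX, NR_CLASSES };

/* dep[a][b] is true once "b was acquired while a was held" has been seen. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Try to record the edge held -> wanted. Returns false, and records
 * nothing, if 'wanted' already leads back to 'held', because the new
 * edge would then close a cycle in the dependency graph.
 */
static bool add_dependency(enum lock_class held, enum lock_class wanted)
{
    bool seen[NR_CLASSES] = { false };

    assert(held != wanted);
    if (reachable(wanted, held, seen))
        return false;
    dep[held][wanted] = true;
    return true;
}
```

Feeding it the chain from the report (barrier_mutex -> cpu_hotplug.lock,
then cpu_hotplug.lock -> slab_mutex) makes the third edge, slab_mutex ->
barrier_mutex, the one that is refused. A graph check of this kind sees only
lock ordering, not any other synchronization between the paths.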
    
    This is actually a false positive. Lockdep has no way of knowing that
    the ABBA deadlock can never actually happen, because of the special
    semantics of cpu_hotplug.refcount and its handling in
    cpu_hotplug_begin(); the mutual exclusion there is achieved not through
    the mutex, but through cpu_hotplug.refcount.
    
    The "neither cpu_up() nor cpu_down() will proceed past cpu_hotplug_begin()
    until everyone who called get_online_cpus() has called put_online_cpus()"
    semantics are totally invisible to lockdep.
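
The refcount semantics described above can be sketched as a single-threaded
userspace model. This is an illustration under stated assumptions, not the
kernel code: the struct, the model_* names and the non-blocking "try"
variant of the writer side are all hypothetical, and the real
cpu_hotplug_begin() waits for the readers instead of returning.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model; only the two field roles come from the text above. */
struct hotplug_model {
    bool lock_held; /* stands in for cpu_hotplug.lock */
    int refcount;   /* stands in for cpu_hotplug.refcount */
};

static void model_lock(struct hotplug_model *hp)
{
    assert(!hp->lock_held);
    hp->lock_held = true;
}

static void model_unlock(struct hotplug_model *hp)
{
    assert(hp->lock_held);
    hp->lock_held = false;
}

/* Reader side: the lock guards only the counter update and is dropped
 * again before returning, so readers never hold it while they run. */
static void model_get_online_cpus(struct hotplug_model *hp)
{
    model_lock(hp);
    hp->refcount++;
    model_unlock(hp);
}

static void model_put_online_cpus(struct hotplug_model *hp)
{
    model_lock(hp);
    assert(hp->refcount > 0);
    hp->refcount--;
    model_unlock(hp);
}

/* Writer side, non-blocking variant: returns true with the lock held when
 * no reader is active; otherwise drops the lock and returns false. */
static bool model_try_hotplug_begin(struct hotplug_model *hp)
{
    model_lock(hp);
    if (hp->refcount == 0)
        return true;
    model_unlock(hp);
    return false;
}

/* Ends the writer section started by a successful try_hotplug_begin. */
static void model_hotplug_done(struct hotplug_model *hp)
{
    model_unlock(hp);
}
```

In this model the writer is kept out by the nonzero counter, not by a held
mutex: a reader between get and put holds no lock at all, which is the part
a lock-order checker cannot see.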
    
    This patch therefore moves the unlock of slab_mutex so that rcu_barrier()
    is called with it unlocked. This has two advantages:
    
    - it slightly reduces hold time of slab_mutex; as it's used to protect
      the cachep list, it's not necessary to hold it over kmem_cache_free()
      call any more
    - it silences the lockdep false positive warning, as it avoids lockdep ever
      learning about slab_mutex -> cpu_hotplug.lock dependency
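
The reordering can be illustrated with a userspace model. This is a sketch
of the ordering only, not the actual diff: the mutex is a plain flag, the
barrier is a stub that records whether the flag was set when it ran, and all
model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool slab_mutex_held;        /* stands in for slab_mutex */
static bool barrier_saw_mutex_held; /* what the stubbed barrier observed */

static void model_mutex_lock(void)
{
    assert(!slab_mutex_held);
    slab_mutex_held = true;
}

static void model_mutex_unlock(void)
{
    assert(slab_mutex_held);
    slab_mutex_held = false;
}

/* Stub for rcu_barrier(): records whether the caller held the mutex. */
static void model_rcu_barrier(void)
{
    barrier_saw_mutex_held = slab_mutex_held;
}

/* Ordering before the change: the barrier runs inside the critical
 * section, so a lock-order checker records slab_mutex -> barrier locks. */
static void model_destroy_before(void)
{
    model_mutex_lock();
    /* unlink the cache from the cache list (elided) */
    model_rcu_barrier();
    /* free the cache (elided) */
    model_mutex_unlock();
}

/* Ordering after the change: the mutex covers only the list update and
 * is released before the barrier and the free. */
static void model_destroy_after(void)
{
    model_mutex_lock();
    /* unlink the cache from the cache list (elided) */
    model_mutex_unlock();
    model_rcu_barrier();
    /* free the cache (elided) */
}
```

With the second ordering the stub never observes the mutex as held, which
corresponds to lockdep never learning the slab_mutex -> cpu_hotplug.lock
dependency.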
    
    Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
    Acked-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Pekka Enberg <penberg@kernel.org>