Skip to content
  • Robert Richter's avatar
    oprofile: Fix locking dependency in sync_start() · 130c5ce7
    Robert Richter authored
    This fixes the A->B/B->A locking dependency, see the warning below.
    
    The function task_exit_notify() is called with (task_exit_notifier)
    .rwsem set and then calls sync_buffer() which locks buffer_mutex. In
    sync_start() the buffer_mutex was set to prevent notifier functions to
    be started before sync_start() is finished. But when registering the
    notifier, (task_exit_notifier).rwsem is locked too, but now in
    different order than in sync_buffer(). In theory this causes a locking
    dependency, what does not occur in practice since task_exit_notify()
    is always called after the notifier is registered which means the lock
    is already released.
    
    However, after checking the notifier functions it turned out the
    buffer_mutex in sync_start() is unnecessary. This is because
    sync_buffer() may be called from the notifiers even if sync_start()
    did not finish yet, the buffers are already allocated but empty. No
    need to protect this with the mutex.
    
    So we fix this theoretical locking dependency by removing buffer_mutex
    in sync_start(). This is similar to the implementation before commit:
    
     750d857c
    
     oprofile: fix crash when accessing freed task structs
    
    which introduced the locking dependency.
    
    Lockdep warning:
    
    oprofiled/4447 is trying to acquire lock:
     (buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
    
    but task is already holding lock:
     ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 ((task_exit_notifier).rwsem){++++..}:
           [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
           [<ffffffff81463a2b>] down_write+0x44/0x67
           [<ffffffff810581c0>] blocking_notifier_chain_register+0x52/0x8b
           [<ffffffff8105a6ac>] profile_event_register+0x2d/0x2f
           [<ffffffffa00013c1>] sync_start+0x47/0xc6 [oprofile]
           [<ffffffffa00001bb>] oprofile_setup+0x60/0xa5 [oprofile]
           [<ffffffffa00014e3>] event_buffer_open+0x59/0x8c [oprofile]
           [<ffffffff810cd3b9>] __dentry_open+0x1eb/0x308
           [<ffffffff810cd59d>] nameidata_to_filp+0x60/0x67
           [<ffffffff810daad6>] do_last+0x5be/0x6b2
           [<ffffffff810dbc33>] path_openat+0xc7/0x360
           [<ffffffff810dbfc5>] do_filp_open+0x3d/0x8c
           [<ffffffff810ccfd2>] do_sys_open+0x110/0x1a9
           [<ffffffff810cd09e>] sys_open+0x20/0x22
           [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
    
    -> #0 (buffer_mutex){+.+...}:
           [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
           [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
           [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
           [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
           [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
           [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
           [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
           [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
           [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
           [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
           [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
           [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
           [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
    
    other info that might help us debug this:
    
    1 lock held by oprofiled/4447:
     #0:  ((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67
    
    stack backtrace:
    Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10
    Call Trace:
     [<ffffffff81063193>] print_circular_bug+0xae/0xbc
     [<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
     [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
     [<ffffffff8106557f>] lock_acquire+0xf8/0x11e
     [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
     [<ffffffff81062627>] ? mark_lock+0x42f/0x552
     [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
     [<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
     [<ffffffffa0000e55>] ? sync_buffer+0x31/0x3ec [oprofile]
     [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
     [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
     [<ffffffff81058026>] ? __blocking_notifier_call_chain+0x39/0x67
     [<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
     [<ffffffff81467b96>] notifier_call_chain+0x37/0x63
     [<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
     [<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
     [<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
     [<ffffffff81039e8f>] do_exit+0x2a/0x6fc
     [<ffffffff81465031>] ? retint_swapgs+0xe/0x13
     [<ffffffff8103a5e4>] do_group_exit+0x83/0xae
     [<ffffffff8103a626>] sys_exit_group+0x17/0x1b
     [<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
    
    Reported-by: default avatarMarcin Slusarz <marcin.slusarz@gmail.com>
    Cc: Carl Love <carll@us.ibm.com>
    Cc: <stable@kernel.org> # .36+
    Signed-off-by: default avatarRobert Richter <robert.richter@amd.com>
    130c5ce7