    hotplugcpu: Avoid deadlocks by waking active_writer · 87af9e7f
    David Hildenbrand authored
    Commit b2c4623d ("rcu: More on deadlock between CPU hotplug and expedited
    grace periods") introduced another problem that can easily be reproduced by
    starting/stopping cpus in a loop.
    
    E.g.:
      for i in `seq 5000`; do
          echo 1 > /sys/devices/system/cpu/cpu1/online
          echo 0 > /sys/devices/system/cpu/cpu1/online
      done
    
    This will result in:
      INFO: task /cpu_start_stop:1 blocked for more than 120 seconds.
      Call Trace:
      ([<00000000006a028e>] __schedule+0x406/0x91c)
       [<0000000000130f60>] cpu_hotplug_begin+0xd0/0xd4
       [<0000000000130ff6>] _cpu_up+0x3e/0x1c4
       [<0000000000131232>] cpu_up+0xb6/0xd4
       [<00000000004a5720>] device_online+0x80/0xc0
       [<00000000004a57f0>] online_store+0x90/0xb0
      ...
    
    And a deadlock.
    
    The problem is that if the last reference is dropped in put_online_cpus()
    while cpu_hotplug.lock cannot be taken, the puts_pending count is
    incremented, but a sleeping active_writer might never be woken up and
    therefore never exits the loop in cpu_hotplug_begin().
    
    This fix removes puts_pending and turns refcount into an atomic variable. We
    also introduce a wait queue for the active_writer, to avoid possible races and
    use-after-free. There is no need to take the lock in put_online_cpus() anymore.
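
    The scheme described above can be illustrated with a minimal userspace
    sketch. This is not the kernel patch: it assumes pthreads and C11 atomics
    as stand-ins, and the names hotplug_state, reader_get(), reader_put() and
    writer_begin() are hypothetical. The condition variable plays the role of
    the writer's wait queue, and wq_lock stands in for the wait queue's
    internal lock, not for cpu_hotplug.lock. Blocking new readers while a
    writer is active is left out for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the hotplug reader/writer state. */
struct hotplug_state {
    atomic_int refcount;      /* number of readers currently holding a ref */
    pthread_mutex_t wq_lock;  /* stand-in for the wait queue's internal lock */
    pthread_cond_t wq;        /* stand-in for the writer's wait queue */
};

static struct hotplug_state hp = {
    .refcount = 0,
    .wq_lock = PTHREAD_MUTEX_INITIALIZER,
    .wq = PTHREAD_COND_INITIALIZER,
};

/* Reader side: take a reference (analogous in spirit to get_online_cpus()). */
static void reader_get(void)
{
    atomic_fetch_add(&hp.refcount, 1);
}

/*
 * Reader side: drop a reference. The reader that drops the last reference
 * always wakes the writer, so no wakeup can be deferred and then lost.
 */
static void reader_put(void)
{
    int prev = atomic_fetch_sub(&hp.refcount, 1);

    assert(prev > 0); /* an unbalanced put would be a caller bug */
    if (prev == 1) {
        pthread_mutex_lock(&hp.wq_lock);
        pthread_cond_broadcast(&hp.wq);
        pthread_mutex_unlock(&hp.wq_lock);
    }
}

/*
 * Writer side: sleep until no reader holds a reference. The count is
 * re-checked under wq_lock, so a wakeup sent between the check and the
 * wait cannot be missed.
 */
static void writer_begin(void)
{
    pthread_mutex_lock(&hp.wq_lock);
    while (atomic_load(&hp.refcount) != 0)
        pthread_cond_wait(&hp.wq, &hp.wq_lock);
    pthread_mutex_unlock(&hp.wq_lock);
}
```

    In this sketch a writer thread would call writer_begin() before changing
    shared state, while readers bracket their critical sections with
    reader_get() and reader_put(). The point of the design is that the
    decrement and the wakeup decision are a single atomic step, so there is
    no separate pending counter that a sleeping writer has to be woken up to
    notice.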
    
    The deadlock can no longer be reproduced with this fix.
    
    Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>