Skip to content
  • Tejun Heo's avatar
    cgroup_freezer: replace freezer->lock with freezer_mutex · e5ced8eb
    Tejun Heo authored
    After 96d365e0 ("cgroup: make css_set_lock a rwsem and rename it
    to css_set_rwsem"), css task iterators requires sleepable context as
    it may block on css_set_rwsem.  I missed that cgroup_freezer was
    iterating tasks under IRQ-safe spinlock freezer->lock.  This leads to
    errors like the following on freezer state reads and transitions.
    
      BUG: sleeping function called from invalid context at /work
     /os/work/kernel/locking/rwsem.c:20
      in_atomic(): 0, irqs_disabled(): 0, pid: 462, name: bash
      5 locks held by bash/462:
       #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff811f0843>] vfs_write+0x1a3/0x1c0
       #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8126d78b>] kernfs_fop_write+0xbb/0x170
       #2:  (s_active#70){.+.+.+}, at: [<ffffffff8126d793>] kernfs_fop_write+0xc3/0x170
       #3:  (freezer_mutex){+.+...}, at: [<ffffffff81135981>] freezer_write+0x61/0x1e0
       #4:  (rcu_read_lock){......}, at: [<ffffffff81135973>] freezer_write+0x53/0x1e0
      Preemption disabled at:[<ffffffff81104404>] console_unlock+0x1e4/0x460
    
      CPU: 3 PID: 462 Comm: bash Not tainted 3.15.0-rc1-work+ #10
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       ffff88000916a6d0 ffff88000e0a3da0 ffffffff81cf8c96 0000000000000000
       ffff88000e0a3dc8 ffffffff810cf4f2 ffffffff82388040 ffff880013aaf740
       0000000000000002 ffff88000e0a3de8 ffffffff81d05974 0000000000000246
      Call Trace:
       [<ffffffff81cf8c96>] dump_stack+0x4e/0x7a
       [<ffffffff810cf4f2>] __might_sleep+0x162/0x260
       [<ffffffff81d05974>] down_read+0x24/0x60
       [<ffffffff81133e87>] css_task_iter_start+0x27/0x70
       [<ffffffff8113584d>] freezer_apply_state+0x5d/0x130
       [<ffffffff81135a16>] freezer_write+0xf6/0x1e0
       [<ffffffff8112eb88>] cgroup_file_write+0xd8/0x230
       [<ffffffff8126d7b7>] kernfs_fop_write+0xe7/0x170
       [<ffffffff811f0756>] vfs_write+0xb6/0x1c0
       [<ffffffff811f121d>] SyS_write+0x4d/0xc0
       [<ffffffff81d08292>] system_call_fastpath+0x16/0x1b
    
    freezer->lock used to be used in hot paths but that time is long gone
    and there's no reason for the lock to be IRQ-safe spinlock or even
    per-cgroup.  In fact, given the fact that a cgroup may contain large
    number of tasks, it's not a good idea to iterate over them while
    holding IRQ-safe spinlock.
    
    Let's simplify locking by replacing per-cgroup freezer->lock with
    global freezer_mutex.  This also makes the comments explaining the
    intricacies of policy inheritance and the locking around it as the
    states are protected by a common mutex.
    
    The conversion is mostly straight-forward.  The followings are worth
    mentioning.
    
    * freezer_css_online() no longer needs double locking.
    
    * freezer_attach() now performs propagation simply while holding
      freezer_mutex.  update_if_frozen() race no longer exists and the
      comment is removed.
    
    * freezer_fork() now tests whether the task is in root cgroup using
      the new task_css_is_root() without doing rcu_read_lock/unlock().  If
      not, it grabs freezer_mutex and performs the operation.
    
    * freezer_read() and freezer_change_state() grab freezer_mutex across
      the whole operation and pin the css while iterating so that each
      descendant processing happens in sleepable context.
    
    Fixes: 96d365e0
    
     ("cgroup: make css_set_lock a rwsem and rename it to css_set_rwsem")
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Acked-by: default avatarLi Zefan <lizefan@huawei.com>
    e5ced8eb