    lockdep: annotate reclaim context (__GFP_NOFS) · cf40bd16
    Nick Piggin authored
    Here is another version, with the incremental patch rolled up, reclaim
    context annotation added to kswapd, and allocation tracing added to the
    slab allocators (which only rarely reach the page allocator, so it is
    good to put annotations there too).
    
    Haven't tested this version as such, but it should be getting closer
    to merge worthy ;)
    
    --
    After noticing some code in mm/filemap.c accidentally performing a __GFP_FS
    allocation when it should not have, I thought it might be a good idea to
    try to catch this kind of thing with lockdep.
    
    I coded up a little idea that seems to work. Unfortunately the system has to
    actually be in __GFP_FS page reclaim and then take the lock before lockdep
    will mark it. But that should still be some orders of magnitude more common
    (and more debuggable) than an actual deadlock condition, so I hope it is an
    improvement (the concept is no less complete than the discovery of a lock's
    interrupt contexts).
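
The rule described above can be illustrated with a minimal userspace sketch. This is a toy model, not the lockdep code from this patch: every `sk_`/`SK_` name is hypothetical, the flag bit values are only illustrative, and the two usage bits loosely mirror the `in-reclaim` and `ov-reclaim` states in the report below. A lock is marked when it is taken while the model is "in" __GFP_FS reclaim, marked again when it is held across an allocation that could enter such reclaim, and flagged once both marks are present.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag bits only; the real definitions live in the kernel. */
#define SK_GFP_WAIT   0x10u
#define SK_GFP_IO     0x40u
#define SK_GFP_FS     0x80u
#define SK_GFP_KERNEL (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)
#define SK_GFP_NOFS   (SK_GFP_WAIT | SK_GFP_IO)

/* Per-lock usage bits (hypothetical names). */
#define SK_USED_IN_RECLAIM   0x1u  /* taken while inside FS reclaim      */
#define SK_HELD_OVER_RECLAIM 0x2u  /* held across an FS-capable allocation */

struct sk_lock {
    const char *name;
    unsigned usage;
    bool held;
};

/* True while the model pretends to be inside __GFP_FS page reclaim. */
static bool sk_in_fs_reclaim;

/* Record a usage bit; return false once the lock carries both bits. */
static bool sk_mark(struct sk_lock *l, unsigned bit)
{
    l->usage |= bit;
    if ((l->usage & SK_USED_IN_RECLAIM) &&
        (l->usage & SK_HELD_OVER_RECLAIM)) {
        printf("inconsistent lock state: %s\n", l->name);
        return false;
    }
    return true;
}

static bool sk_lock_acquire(struct sk_lock *l)
{
    l->held = true;
    if (sk_in_fs_reclaim)
        return sk_mark(l, SK_USED_IN_RECLAIM);
    return true;
}

static void sk_lock_release(struct sk_lock *l)
{
    l->held = false;
}

/* Allocation hook: only allocations that may wait and may recurse into
 * the filesystem mark the locks that are currently held. */
static bool sk_alloc(struct sk_lock *locks, int n, unsigned gfp)
{
    bool ok = true;
    if ((gfp & SK_GFP_WAIT) && (gfp & SK_GFP_FS)) {
        for (int i = 0; i < n; i++) {
            if (locks[i].held && !sk_mark(&locks[i], SK_HELD_OVER_RECLAIM))
                ok = false;
        }
    }
    return ok;
}

/* Take a lock "in reclaim", then hold it over an allocation made with
 * the given mask. Returns 1 if the model flags the lock, 0 otherwise. */
int sk_demo(unsigned gfp)
{
    struct sk_lock testlock = { "testlock", 0, false };
    bool ok = true;

    sk_in_fs_reclaim = true;
    if (!sk_lock_acquire(&testlock))
        ok = false;
    sk_lock_release(&testlock);
    sk_in_fs_reclaim = false;

    if (!sk_lock_acquire(&testlock))
        ok = false;
    if (!sk_alloc(&testlock, 1, gfp))
        ok = false;
    sk_lock_release(&testlock);

    return ok ? 0 : 1;
}
```

In this model, `sk_demo(SK_GFP_KERNEL)` flags `testlock`, while `sk_demo(SK_GFP_NOFS)` does not, because the second mask never sets the held-over-reclaim bit. As in the real scheme, nothing is reported unless the lock has actually been observed in both situations.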
    
    I guess we could even do the same thing with __GFP_IO (normal reclaim), and
    even GFP_NOIO locks too... but filesystems will have the most locks and fiddly
    code paths, so let's start there and see how it goes.
    
    It *seems* to work. I did a quick test.
    
    =================================
    [ INFO: inconsistent lock state ]
    2.6.28-rc6-00007-ged313489-dirty #26
    ---------------------------------
    inconsistent {in-reclaim-W} -> {ov-reclaim-W} usage.
    modprobe/8526 [HC0[0]:SC0[0]:HE1:SE1] takes:
     (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
    {in-reclaim-W} state was registered at:
      [<ffffffff80267bdb>] __lock_acquire+0x75b/0x1a60
      [<ffffffff80268f71>] lock_acquire+0x91/0xc0
      [<ffffffff8070f0e1>] mutex_lock_nested+0xb1/0x310
      [<ffffffffa002002b>] brd_init+0x2b/0x216 [brd]
      [<ffffffff8020903b>] _stext+0x3b/0x170
      [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
      [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
      [<ffffffffffffffff>] 0xffffffffffffffff
    irq event stamp: 3929
    hardirqs last  enabled at (3929): [<ffffffff8070f2b5>] mutex_lock_nested+0x285/0x310
    hardirqs last disabled at (3928): [<ffffffff8070f089>] mutex_lock_nested+0x59/0x310
    softirqs last  enabled at (3732): [<ffffffff8061f623>] sk_filter+0x83/0xe0
    softirqs last disabled at (3730): [<ffffffff8061f5b6>] sk_filter+0x16/0xe0
    
    other info that might help us debug this:
    1 lock held by modprobe/8526:
     #0:  (testlock){--..}, at: [<ffffffffa0020055>] brd_init+0x55/0x216 [brd]
    
    stack backtrace:
    Pid: 8526, comm: modprobe Not tainted 2.6.28-rc6-00007-ged313489-dirty #26
    Call Trace:
     [<ffffffff80265483>] print_usage_bug+0x193/0x1d0
     [<ffffffff80266530>] mark_lock+0xaf0/0xca0
     [<ffffffff80266735>] mark_held_locks+0x55/0xc0
     [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
     [<ffffffff802667ca>] trace_reclaim_fs+0x2a/0x60
     [<ffffffff80285005>] __alloc_pages_internal+0x475/0x580
     [<ffffffff8070f29e>] ? mutex_lock_nested+0x26e/0x310
     [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
     [<ffffffffa002006a>] brd_init+0x6a/0x216 [brd]
     [<ffffffffa0020000>] ? brd_init+0x0/0x216 [brd]
     [<ffffffff8020903b>] _stext+0x3b/0x170
     [<ffffffff8070f8b9>] ? mutex_unlock+0x9/0x10
     [<ffffffff8070f83d>] ? __mutex_unlock_slowpath+0x10d/0x180
     [<ffffffff802669ec>] ? trace_hardirqs_on_caller+0x12c/0x190
     [<ffffffff80272ebf>] sys_init_module+0xaf/0x1e0
     [<ffffffff8020c3fb>] system_call_fastpath+0x16/0x1b
    
    Signed-off-by: Nick Piggin <npiggin@suse.de>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>