Skip to content
Snippets Groups Projects
  1. Jul 19, 2007
    • Peter Zijlstra's avatar
      lockstat: measure lock bouncing · 96645678
      Peter Zijlstra authored
      
          __acquire
              |
             lock _____
              |        \
              |    __contended
              |         |
              |        wait
              | _______/
              |/
              |
         __acquired
              |
         __release
              |
           unlock
      
      We measure acquisition and contention bouncing.
      
      This is done by recording a cpu stamp in each lock instance.
      
      Contention bouncing requires the cpu stamp to be set on acquisition. Hence we
      move __acquired into the generic path.
      
      __acquired is then used to measure acquisition bouncing by comparing the
      current cpu with the old stamp before replacing it.
      
      __contended is used to measure contention bouncing (only useful for preemptable
      locks)
      
      [akpm@linux-foundation.org: cleanups]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      96645678
    • Peter Zijlstra's avatar
      lockdep: various fixes · 4b32d0a4
      Peter Zijlstra authored
      
       - update the copyright notices
       - use the default hash function
       - fix a thinko in a BUILD_BUG_ON
       - add a WARN_ON to spot inconsitent naming
       - fix a termination issue in /proc/lock_stat
      
      [akpm@linux-foundation.org: cleanups]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4b32d0a4
    • Peter Zijlstra's avatar
      lockstat: human readability tweaks · c46261de
      Peter Zijlstra authored
      
      Present all this fancy new lock statistics information:
      
      *warning, _wide_ output ahead*
      
      (output edited for purpose of brevity)
      
       # cat /proc/lock_stat
      lock_stat version 0.1
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
                                    class name    contentions   waittime-min   waittime-max waittime-total   acquisitions   holdtime-min   holdtime-max holdtime-total
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
      
                               &inode->i_mutex:         14458           6.57      398832.75     2469412.23        6768876           0.34    11398383.65   339410830.89
                               ---------------
                               &inode->i_mutex           4486          [<ffffffff802a08f9>] pipe_wait+0x86/0x8d
                               &inode->i_mutex              0          [<ffffffff802a01e8>] pipe_write_fasync+0x29/0x5d
                               &inode->i_mutex              0          [<ffffffff802a0e18>] pipe_read+0x74/0x3a5
                               &inode->i_mutex              0          [<ffffffff802a1a6a>] do_lookup+0x81/0x1ae
      
      .................................................................................................................................................................
      
                    &inode->i_data.tree_lock-W:           491           0.27          62.47         493.89        2477833           0.39         468.89     1146584.25
                    &inode->i_data.tree_lock-R:            65           0.44           4.27          48.78       26288792           0.36         184.62    10197458.24
                    --------------------------
                      &inode->i_data.tree_lock             46          [<ffffffff80277095>] __do_page_cache_readahead+0x69/0x24f
                      &inode->i_data.tree_lock             31          [<ffffffff8026f9fb>] add_to_page_cache+0x31/0xba
                      &inode->i_data.tree_lock              0          [<ffffffff802770ee>] __do_page_cache_readahead+0xc2/0x24f
                      &inode->i_data.tree_lock              0          [<ffffffff8026f6e4>] find_get_page+0x1a/0x58
      
      .................................................................................................................................................................
      
                            proc_inum_idr.lock:             0           0.00           0.00           0.00             36           0.00          65.60         148.26
                              proc_subdir_lock:             0           0.00           0.00           0.00        3049859           0.00         106.81     1563212.42
                              shrinker_rwsem-W:             0           0.00           0.00           0.00              5           0.00           1.73           3.68
                              shrinker_rwsem-R:             0           0.00           0.00           0.00            633           2.57         246.57       10909.76
      
      'contentions' and 'acquisitions' are the number of such events measured (since
      the last reset). The waittime- and holdtime- (min, max, total) numbers are
      presented in microseconds.
      
      If there are any contention points, the lock class is presented in the block
      format (as i_mutex and tree_lock above), otherwise a single line of output is
      presented.
      
      The output is sorted on absolute number of contentions (read + write), this
      should get the worst offenders presented first, so that:
      
       # grep : /proc/lock_stat | head
      
      will quickly show who's bad.
      
      The stats can be reset using:
      
       # echo 0 > /proc/lock_stat
      
      [bunk@stusta.de: make 2 functions static]
      [akpm@linux-foundation.org: fix printk warning]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Acked-by: default avatarJason Baron <jbaron@redhat.com>
      Signed-off-by: default avatarAdrian Bunk <bunk@stusta.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c46261de
    • Peter Zijlstra's avatar
      lockstat: core infrastructure · f20786ff
      Peter Zijlstra authored
      
      Introduce the core lock statistics code.
      
      Lock statistics provides lock wait-time and hold-time (as well as the count
      of corresponding contention and acquisitions events). Also, the first few
      call-sites that encounter contention are tracked.
      
      Lock wait-time is the time spent waiting on the lock. This provides insight
      into the locking scheme, that is, a heavily contended lock is indicative of
      a too coarse locking scheme.
      
      Lock hold-time is the duration the lock was held, this provides a reference for
      the wait-time numbers, so they can be put into perspective.
      
        1)
          lock
        2)
          ... do stuff ..
          unlock
        3)
      
      The time between 1 and 2 is the wait-time. The time between 2 and 3 is the
      hold-time.
      
      The lockdep held-lock tracking code is reused, because it already collects locks
      into meaningful groups (classes), and because it is an existing infrastructure
      for lock instrumentation.
      
      Currently lockdep tracks lock acquisition with two hooks:
      
        lock()
          lock_acquire()
          _lock()
      
       ... code protected by lock ...
      
        unlock()
          lock_release()
          _unlock()
      
      We need to extend this with two more hooks, in order to measure contention.
      
        lock_contended() - used to measure contention events
        lock_acquired()  - completion of the contention
      
      These are then placed the following way:
      
        lock()
          lock_acquire()
          if (!_try_lock())
            lock_contended()
            _lock()
            lock_acquired()
      
       ... do locked stuff ...
      
        unlock()
          lock_release()
          _unlock()
      
      (Note: the try_lock() 'trick' is used to avoid instrumenting all platform
             dependent lock primitive implementations.)
      
      It is also possible to toggle the two lockdep features at runtime using:
      
        /proc/sys/kernel/prove_locking
        /proc/sys/kernel/lock_stat
      
      (esp. turning off the O(n^2) prove_locking functionaliy can help)
      
      [akpm@linux-foundation.org: build fixes]
      [akpm@linux-foundation.org: nuke unneeded ifdefs]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Acked-by: default avatarJason Baron <jbaron@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f20786ff
    • Peter Zijlstra's avatar
      lockdep: reduce the ifdeffery · 8e18257d
      Peter Zijlstra authored
      
      Move code around to get fewer but larger #ifdef sections.  Break some
      in-function #ifdefs out into their own functions.
      
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8e18257d
    • Peter Zijlstra's avatar
      lockdep: sanitise CONFIG_PROVE_LOCKING · ca58abcb
      Peter Zijlstra authored
      
      Ensure that all of the lock dependency tracking code is under
      CONFIG_PROVE_LOCKING.  This allows us to use the held lock tracking code for
      other purposes.
      
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Acked-by: default avatarJason Baron <jbaron@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ca58abcb
  2. Jul 17, 2007
    • Tejun Heo's avatar
      kallsyms: make KSYM_NAME_LEN include space for trailing '\0' · 9281acea
      Tejun Heo authored
      
      KSYM_NAME_LEN is peculiar in that it does not include the space for the
      trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating
      buffer.  This is nonsense and error-prone.  Moreover, when the caller
      forgets that it's very likely to subtly bite back by corrupting the stack
      because the last position of the buffer is always cleared to zero.
      
      This patch increments KSYM_NAME_LEN by one and updates code accordingly.
      
      * off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
        is fixed.
      
      * Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
        MODULE_NAME_LEN was treated as if it didn't include space for the
        trailing '\0'.  Fix it.
      
      Signed-off-by: default avatarTejun Heo <htejun@gmail.com>
      Acked-by: default avatarPaulo Marques <pmarques@grupopie.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9281acea
  3. May 08, 2007
  4. Mar 22, 2007
  5. Mar 01, 2007
    • Sam Ravnborg's avatar
      [PATCH] fix section mismatch warning in lockdep · 1499993c
      Sam Ravnborg authored
      
      lockdep_init() is marked __init but used in several places
      outside __init code. This causes following warnings:
      $ scripts/mod/modpost kernel/lockdep.o
      WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.lockdep_init_map after 'lockdep_init_map' (at offset 0x105)
      WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.lockdep_reset_lock after 'lockdep_reset_lock' (at offset 0x35)
      WARNING: kernel/built-in.o - Section mismatch: reference to .init.text:lockdep_init from .text.__lock_acquire after '__lock_acquire' (at offset 0xb2)
      
      The warnings are less obviously due to heavy inlining by gcc - this is not
      altered.
      
      Fix the section mismatch warnings by removing the __init marking, which
      seems obviously wrong.
      
      Signed-off-by: default avatarSam Ravnborg <sam@ravnborg.org>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1499993c
  6. Feb 20, 2007
  7. Feb 11, 2007
  8. Dec 30, 2006
  9. Dec 13, 2006
  10. Dec 07, 2006
  11. Dec 06, 2006
  12. Nov 17, 2006
    • Ingo Molnar's avatar
      [PATCH] lockdep: fix static keys in module-allocated percpu areas · 1ff56830
      Ingo Molnar authored
      
      lockdep got confused by certain locks in modules:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
      
       Call Trace:
        [<ffffffff8026f40d>] dump_trace+0xaa/0x3f2
        [<ffffffff8026f78f>] show_trace+0x3a/0x60
        [<ffffffff8026f9d1>] dump_stack+0x15/0x17
        [<ffffffff802abfe8>] __lock_acquire+0x724/0x9bb
        [<ffffffff802ac52b>] lock_acquire+0x4d/0x67
        [<ffffffff80267139>] rt_spin_lock+0x3d/0x41
        [<ffffffff8839ed3f>] :ip_conntrack:__ip_ct_refresh_acct+0x131/0x174
        [<ffffffff883a1334>] :ip_conntrack:udp_packet+0xbf/0xcf
        [<ffffffff8839f9af>] :ip_conntrack:ip_conntrack_in+0x394/0x4a7
        [<ffffffff8023551f>] nf_iterate+0x41/0x7f
        [<ffffffff8025946a>] nf_hook_slow+0x64/0xd5
        [<ffffffff802369a2>] ip_rcv+0x24e/0x506
        [...]
      
      Steven Rostedt found the bug: static_obj() check did not take
      PERCPU_ENOUGH_ROOM into account, so in-module DEFINE_PER_CPU-area locks
      were triggering this message.
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      1ff56830
  13. Oct 17, 2006
    • Ingo Molnar's avatar
      [PATCH] lockdep: increase max allowed recursion depth · ca268c69
      Ingo Molnar authored
      
      In general, lockdep warnings are intended to be non-fatal, so I have put in
      various practical limits on internal data structure failure modes.  We haven't
      had a /single/ lockdep-internal crash ever since lockdep went upstream [the
      unwinder crashes are outside of lockdep], and that's largely due to the good
      internal checks it does.
      
      Recursion within the dependency graph is currently limited to 20, that's
      probably not enough on some many-CPU boxes - this patch doubles it to 40.  I
      have written the lockdep functions to have as small stackframes as possible,
      so 40 should be OK too.  (The practical recursion limit should be somewhere
      between 100 and 200 entries.  If we hit that then I'll change the algorithm to
      be iteration-based.  Graph walking logic is so easy to program via recursion,
      so i'd like to keep recursion as long as possible.)
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      ca268c69
  14. Oct 11, 2006
  15. Oct 10, 2006
  16. Oct 02, 2006
    • Serge E. Hallyn's avatar
      [PATCH] namespaces: utsname: use init_utsname when appropriate · 96b644bd
      Serge E. Hallyn authored
      
      In some places, particularly drivers and __init code, the init utsns is the
      appropriate one to use.  This patch replaces those with a the init_utsname
      helper.
      
      Changes: Removed several uses of init_utsname().  Hope I picked all the
      	right ones in net/ipv4/ipconfig.c.  These are now changed to
      	utsname() (the per-process namespace utsname) in the previous
      	patch (2/7)
      
      [akpm@osdl.org: CIFS fix]
      Signed-off-by: default avatarSerge E. Hallyn <serue@us.ibm.com>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Herbert Poetzl <herbert@13thfloor.at>
      Cc: Andrey Savochkin <saw@sw.ru>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      96b644bd
  17. Sep 29, 2006
    • Ingo Molnar's avatar
      [PATCH] lockdep core: improve the lock-chain-hash · 03cbc358
      Ingo Molnar authored
      
      With CONFIG_DEBUG_LOCK_ALLOC turned off i was getting sporadic failures in
      the locking self-test:
      
        ------------>
        | Locking API testsuite:
        ----------------------------------------------------------------------------
                                         | spin |wlock |rlock |mutex | wsem | rsem |
          --------------------------------------------------------------------------
                             A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                         A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                     A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                     A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-C-C-D-D-A deadlock:  ok  |FAILED|  ok  |  ok  |  ok  |  ok  |
                 A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |FAILED|
      
      after much debugging it turned out to be caused by accidental chain-hash
      key collisions.  The current hash is:
      
       #define iterate_chain_key(key1, key2) \
      	(((key1) << MAX_LOCKDEP_KEYS_BITS/2) ^ \
      	((key1) >> (64-MAX_LOCKDEP_KEYS_BITS/2)) ^ \
       	(key2))
      
      where MAX_LOCKDEP_KEYS_BITS is 11.  This hash is pretty good as it will
      shift by 5 bits in every iteration, where every new ID 'mixed' into the
      hash would have up to 11 bits.  But because there was a 6 bits overlap
      between subsequent IDs and their high bits tended to be similar, there was
      a chance for accidental chain-hash collision for a low number of locks
      held.
      
      the solution is to shift by 11 bits:
      
       #define iterate_chain_key(key1, key2) \
      	(((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
      	((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
       	(key2))
      
      This keeps the hash perfect up to 5 locks held, but even above that the
      hash is still good because 11 bits is a relative prime to the total 64
      bits, so a complete match will only occur after 64 held locks (which doesnt
      happen in Linux).  Even after 5 locks held, entropy of the 5 IDs mixed into
      the hash is already good enough so that overlap doesnt generate a colliding
      hash ID.
      
      with this change the false positives went away.
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      03cbc358
    • Dave Jones's avatar
      [PATCH] lockdep: print kernel version · 99de055a
      Dave Jones authored
      
      Lets do the same thing we do for oopses - print out the version in the
      report.  It's an extra line of output though.  We could tack it on the end
      of the INFO: lines, but that screws up Ingo's pretty output.
      
      Signed-off-by: default avatarDave Jones <davej@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      99de055a
  18. Sep 26, 2006
  19. Jul 10, 2006
Loading