1. 16 Jan, 2009 1 commit
  2. 14 Jan, 2009 1 commit
  3. 31 Dec, 2008 2 commits
  4. 03 Dec, 2008 1 commit
  5. 16 Oct, 2008 1 commit
    • Make the taint flags reliable · 25ddbb18
      Andi Kleen authored
      It's somewhat unlikely that it happens, but right now a race window
      between interrupts or machine checks or oopses could corrupt the tainted
      bitmap because it is modified in a non-atomic fashion.
      
      Convert the taint variable to an unsigned long and use only atomic bit
      operations on it.
      
      Unfortunately this means the intvec sysctl functions cannot be used on it
      anymore.
      
      It turned out the taint sysctl handler could actually be simplified a bit
      (since it only increases capabilities) so this patch actually removes
      code.
      
      [akpm@linux-foundation.org: remove unneeded include]
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
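      The conversion described above can be illustrated with a small userspace
      sketch. It uses C11 atomics rather than the kernel's bit operations, and
      the names tainted_mask_sketch, add_taint_sketch and test_taint_sketch
      are hypothetical stand-ins, not the kernel's own identifiers.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's taint bitmap. */
static atomic_ulong tainted_mask_sketch;

#define TAINT_BITS_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Set one taint bit with a single atomic read-modify-write, so two
 * concurrent callers cannot overwrite each other's update. */
static void add_taint_sketch(unsigned int flag)
{
    assert(flag < TAINT_BITS_SKETCH);
    atomic_fetch_or(&tainted_mask_sketch, 1UL << flag);
}

/* Report whether a given taint bit is currently set. */
static bool test_taint_sketch(unsigned int flag)
{
    assert(flag < TAINT_BITS_SKETCH);
    return (atomic_load(&tainted_mask_sketch) >> flag) & 1UL;
}
```

      A plain "mask |= bit" is a separate load and store, which is where an
      interrupt or machine check arriving in between could lose an update.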
  6. 10 Sep, 2008 1 commit
  7. 02 Sep, 2008 1 commit
  8. 29 Aug, 2008 1 commit
    • Don't trigger softlockup detector on network fs blocked tasks · 316d9679
      Andi Kleen authored
      Pulling the ethernet cable on a 2.6.27-rc system with NFS mounts
      currently leads to an ongoing flood of soft lockup detector backtraces
      for all tasks blocked on the NFS mounts when the hiccup takes
      longer than 120s.
      
      I don't think NFS problems should be all that noisy.
      
      Luckily there's a reasonably easy way to distinguish this case.
      
      Don't report task softlockup warnings for tasks in TASK_KILLABLE
      state, which is used by the network file systems.
      
      I believe this patch is a 2.6.27 candidate.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
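      One way to make this distinction is sketched below, assuming that the
      killable state is the uninterruptible bit combined with an extra
      wake-on-kill bit. The numeric values and the _SKETCH names are
      illustrative assumptions, not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative state bits; the numeric values are assumptions. */
#define TASK_UNINTERRUPTIBLE_SKETCH 0x02
#define TASK_WAKEKILL_SKETCH        0x80
#define TASK_KILLABLE_SKETCH \
    (TASK_WAKEKILL_SKETCH | TASK_UNINTERRUPTIBLE_SKETCH)

/* The two states must differ, or the filter below could not tell them apart. */
static_assert(TASK_KILLABLE_SKETCH != TASK_UNINTERRUPTIBLE_SKETCH,
              "killable must carry an extra bit");

/* Report only plain uninterruptible sleepers. A bit test would also
 * match killable tasks; an exact comparison skips them. */
static bool should_report_hung_sketch(long state)
{
    return state == TASK_UNINTERRUPTIBLE_SKETCH;
}
```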
  9. 26 Jul, 2008 1 commit
  10. 05 Jul, 2008 1 commit
  11. 01 Jul, 2008 1 commit
    • softlockup: fix watchdog task wakeup frequency · 3e2f69fd
      Johannes Weiner authored
      The print_timestamp can never be bigger than the touch_timestamp; at
      most it can be equal.  And if it is, the second check, that
      touch_timestamp + 1 is bigger than print_timestamp, is always true, too.
      
      The check for equality is sufficient as we proceed in one-second-steps
      and are at least one second away from the last print-out if we have
      another timestamp.
      Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
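      The argument above can be checked mechanically. The sketch below
      reconstructs the two-part test from the description (it is not copied
      from the kernel source) and compares it with the plain equality test;
      all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Two-part test as described in the commit message (reconstructed). */
static bool old_check_sketch(unsigned long print_ts, unsigned long touch_ts)
{
    return print_ts >= touch_ts && print_ts < touch_ts + 1;
}

/* Simplified test: equality alone. */
static bool new_check_sketch(unsigned long print_ts, unsigned long touch_ts)
{
    return print_ts == touch_ts;
}

/* Compare both tests for every pair of whole-second timestamps below limit. */
static void verify_equivalence_sketch(unsigned long limit)
{
    for (unsigned long p = 0; p < limit; p++)
        for (unsigned long t = 0; t < limit; t++)
            assert(old_check_sketch(p, t) == new_check_sketch(p, t));
}
```

      Because the timestamps advance in whole seconds, "at least touch and
      below touch + 1" leaves exactly one value, which is why equality is
      enough.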
  12. 30 Jun, 2008 1 commit
  13. 25 Jun, 2008 1 commit
  14. 19 Jun, 2008 1 commit
    • softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression · 9c106c11
      Jason Wessel authored
      The touch_nmi_watchdog() routine on x86 ultimately calls
      touch_softlockup_watchdog().  The problem is that to touch the
      softlockup watchdog, the cpu_clock code has to be called which could
      involve multiple cpu locks and can lead to a hard hang if one of the
      locks is held by a processor that is not going to return anytime soon
      (such as could be the case with kgdb or perhaps even with some other
      kind of exception).
      
      This patch causes the public version of the
      touch_softlockup_watchdog() to defer the cpu clock access to a later
      point.
      
      The test case for this problem is to use the following kernel config
      options:
      
      CONFIG_KGDB_TESTS=y
      CONFIG_KGDB_TESTS_ON_BOOT=y
      CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"
      
      It should be noted that kgdb test suite and these options were not
      available until 2.6.26-rc2, so it was necessary to patch the kgdb
      test suite during the bisection.
      
      I would consider this patch a regression fix because the problem first
      appeared in commit 27ec4407 when some
      logic was added to try to periodically sync the clocks.  It was
      possible to work around this particular problem by simply not
      performing the sync anytime the system was in a critical context.
      This was ok until commit 3e51f33f,
      which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
      multi-cpu locks to sync the clocks.  It became clear that accessing
      this code from an NMI was the source of the lockups.  Avoiding the
      access to the low level clock code from code inside the NMI
      processing also fixed the problem with the 27ec44... commit.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
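      The deferral can be pictured with the following sketch. It assumes the
      public touch routine merely writes a sentinel value and that the
      periodic tick, running in a safe context, does the actual clock read.
      The fake clock and all names are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU clock, in seconds. */
static unsigned long fake_clock_secs;

/* Last time the watchdog was touched; 0 means "refresh on next tick". */
static unsigned long touch_ts_sketch;

/* Public touch: no clock access, so it takes no locks and is safe to
 * call from NMI-like contexts. */
static void touch_watchdog_sketch(void)
{
    touch_ts_sketch = 0;
}

/* Periodic tick: performs the deferred clock read, then checks the
 * threshold. Returns 1 when a lockup would be reported. */
static int watchdog_tick_sketch(unsigned long thresh_secs)
{
    assert(thresh_secs > 0);
    if (touch_ts_sketch == 0) {
        touch_ts_sketch = fake_clock_secs;  /* deferred refresh */
        return 0;
    }
    return fake_clock_secs - touch_ts_sketch > thresh_secs;
}
```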
  15. 18 Jun, 2008 1 commit
  16. 02 Jun, 2008 1 commit
    • softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression · 8c2238ea
      Jason Wessel authored
      The touch_nmi_watchdog() routine on x86 ultimately calls
      touch_softlockup_watchdog().  The problem is that to touch the
      softlockup watchdog, the cpu_clock code has to be called which could
      involve multiple cpu locks and can lead to a hard hang if one of the
      locks is held by a processor that is not going to return anytime soon
      (such as could be the case with kgdb or perhaps even with some other
      kind of exception).
      
      This patch causes the public version of the
      touch_softlockup_watchdog() to defer the cpu clock access to a later
      point.
      
      The test case for this problem is to use the following kernel config
      options:
      
      CONFIG_KGDB_TESTS=y
      CONFIG_KGDB_TESTS_ON_BOOT=y
      CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"
      
      It should be noted that kgdb test suite and these options were not
      available until 2.6.26-rc2, so it was necessary to patch the kgdb
      test suite during the bisection.
      
      I would consider this patch a regression fix because the problem first
      appeared in commit 27ec4407 when some
      logic was added to try to periodically sync the clocks.  It was
      possible to work around this particular problem by simply not
      performing the sync anytime the system was in a critical context.
      This was ok until commit 3e51f33f,
      which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
      multi-cpu locks to sync the clocks.  It became clear that accessing
      this code from an NMI was the source of the lockups.  Avoiding the
      access to the low level clock code from code inside the NMI
      processing also fixed the problem with the 27ec44... commit.
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 24 May, 2008 2 commits
  18. 29 Feb, 2008 1 commit
  19. 01 Feb, 2008 1 commit
  20. 25 Jan, 2008 2 commits
    • softlockup: fix signedness · 90739081
      Ingo Molnar authored
      fix softlockup tunables signedness.
      
      mark tunables read-mostly.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks · 82a1fcb9
      Ingo Molnar authored
      this patch extends the soft-lockup detector to automatically
      detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
      printed the following way:
      
       ------------------>
       INFO: task prctl:3042 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
       prctl         D fd5e3793     0  3042   2997
              f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
              f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
              f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
       Call Trace:
        [<c04883a5>] schedule_timeout+0x6d/0x8b
        [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17
        [<c0133a76>] msleep+0x10/0x16
        [<c0138974>] sys_prctl+0x30/0x1e2
        [<c0104c52>] sysenter_past_esp+0x5f/0xa5
        =======================
       2 locks held by prctl/3042:
       #0:  (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a
       #1:  (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9
       <------------------
      
      the current default timeout is 120 seconds. Such messages are printed
      up to 10 times per bootup. If the system has crashed already then the
      messages are not printed.
      
      if lockdep is enabled then all held locks are printed as well.
      
      this feature is a natural extension to the softlockup-detector (kernel
      locked up without scheduling) and to the NMI watchdog (kernel locked up
      with IRQs disabled).
      
      [ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ]
      [ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
  21. 19 Oct, 2007 1 commit
  22. 17 Oct, 2007 5 commits
    • softlockup: add a /proc tuning parameter · c4f3b63f
      Ravikiran G Thirumalai authored
      Control the trigger limit for softlockup warnings.  This is useful for
      debugging softlockups, by lowering the softlockup_thresh to identify
      possible softlockups earlier.
      
      This patch:
      1. Adds a sysctl softlockup_thresh with valid values of 1-60s
         (Higher value to disable false positives)
      2. Changes the softlockup printk to print the cpu softlockup time
      
      [akpm@linux-foundation.org: Fix various warnings and add definition of "two"]
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
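      The 1-60s range check can be sketched as a plain bounds test. This is a
      userspace illustration only; the setter name, the starting value of 10
      and the use of -EINVAL for rejected writes are assumptions, not details
      taken from the patch.

```c
#include <assert.h>
#include <errno.h>

#define THRESH_MIN_SKETCH 1
#define THRESH_MAX_SKETCH 60
static_assert(THRESH_MIN_SKETCH <= THRESH_MAX_SKETCH,
              "range must be non-empty");

/* Hypothetical tunable; the starting value is an assumption. */
static int softlockup_thresh_sketch = 10;

/* Accept only values inside the documented 1-60 second range. */
static int set_softlockup_thresh_sketch(int secs)
{
    if (secs < THRESH_MIN_SKETCH || secs > THRESH_MAX_SKETCH)
        return -EINVAL;
    softlockup_thresh_sketch = secs;
    return 0;
}
```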
    • softlockup watchdog: style cleanups · a5f2ce3c
      Ingo Molnar authored
      kernel/softlockup.c grew a few style uncleanlinesses in the past few
      months, clean that up. No functional changes:
      
         text    data     bss     dec     hex filename
         1126      76       4    1206     4b6 softlockup.o.before
         1129      76       4    1209     4b9 softlockup.o.after
      
      ( the 3 bytes .text increase is due to the "<1>" appended to one of
        the printk messages. )
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • softlockup: improve debug output · 43581a10
      Ingo Molnar authored
      Improve the debuggability of kernel lockups by enhancing the debug
      output of the softlockup detector: print the task that causes the lockup
      and try to print a more intelligent backtrace.
      
      The old format was:
      
        BUG: soft lockup detected on CPU#1!
         [<c0105e4a>] show_trace_log_lvl+0x19/0x2e
         [<c0105f43>] show_trace+0x12/0x14
         [<c0105f59>] dump_stack+0x14/0x16
         [<c015f6bc>] softlockup_tick+0xbe/0xd0
         [<c013457d>] run_local_timers+0x12/0x14
         [<c01346b8>] update_process_times+0x3e/0x63
         [<c0145fb8>] tick_sched_timer+0x7c/0xc0
         [<c0140a75>] hrtimer_interrupt+0x135/0x1ba
         [<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
         [<c0105aa3>] apic_timer_interrupt+0x33/0x38
         [<c0104f8a>] syscall_call+0x7/0xb
         =======================
      
      The new format is:
      
        BUG: soft lockup detected on CPU#1! [prctl:2363]
      
        Pid: 2363, comm:                prctl
        EIP: 0060:[<c013915f>] CPU: 1
        EIP is at sys_prctl+0x24/0x18c
         EFLAGS: 00000213    Not tainted  (2.6.22-cfs-v20 #26)
        EAX: 00000001 EBX: 000003e7 ECX: 00000001 EDX: f6df0000
        ESI: 000003e7 EDI: 000003e7 EBP: f6df0fb0 DS: 007b ES: 007b FS: 00d8
        CR0: 8005003b CR2: 4d8c3340 CR3: 3731d000 CR4: 000006d0
         [<c0105e4a>] show_trace_log_lvl+0x19/0x2e
         [<c0105f43>] show_trace+0x12/0x14
         [<c01040be>] show_regs+0x1ab/0x1b3
         [<c015f807>] softlockup_tick+0xef/0x108
         [<c013457d>] run_local_timers+0x12/0x14
         [<c01346b8>] update_process_times+0x3e/0x63
         [<c0145fcc>] tick_sched_timer+0x7c/0xc0
         [<c0140a89>] hrtimer_interrupt+0x135/0x1ba
         [<c011bde7>] smp_apic_timer_interrupt+0x6e/0x80
         [<c0105aa3>] apic_timer_interrupt+0x33/0x38
         [<c0104f8a>] syscall_call+0x7/0xb
         =======================
      
      Note that in the old format we only knew that some system call locked
      up, we didn't know _which_. With the new format we know that it's at a
      specific place in sys_prctl(). [which was where I created an artificial
      kernel lockup to test the new format.]
      
      This is also useful if the lockup happens in user-space - the user-space
      EIP (and other registers) will be printed too. (such a lockup would
      either suggest that the task was running at SCHED_FIFO:99 and looping
      for more than 10 seconds, or that the softlockup detector has a
      false-positive.)
      
      The task name is printed first too, just in case we don't manage to print
      a useful backtrace.
      
      [satyam@infradead.org: fix warning]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Satyam Sharma <satyam@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix the softlockup watchdog to actually work · a115d5ca
      Ingo Molnar authored
      this Xen related commit:
      
         commit 966812dc
         Author: Jeremy Fitzhardinge <jeremy@goop.org>
         Date:   Tue May 8 00:28:02 2007 -0700
      
             Ignore stolen time in the softlockup watchdog
      
      broke the softlockup watchdog so that it never reports any lockups. (!)
      
      print_timestamp defaults to 0, this makes the following condition
      always true:
      
      	if (print_timestamp < (touch_timestamp + 1) ||
      
      and we'll in essence never report soft lockups.
      
      apparently the functionality of the soft lockup watchdog was never
      actually tested with that patch applied ...
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
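      To see why the quoted clause always held, the sketch below models only
      that first clause of the condition; the rest of the original statement
      is not shown in the message and is not reproduced here. The function
      name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Models only the first clause quoted in the commit message. Since it
 * is joined to the rest with ||, a true result here makes the whole
 * condition true and the report is skipped. */
static bool skip_report_sketch(unsigned long print_ts, unsigned long touch_ts)
{
    /* Guard against wraparound of touch_ts + 1 in this sketch. */
    assert(touch_ts + 1 > touch_ts);
    return print_ts < touch_ts + 1;
}
```

      With print_ts left at its default of 0, the clause reduces to
      0 < touch_ts + 1, which holds for every timestamp the watchdog can
      record.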
    • softlockup: use cpu_clock() instead of sched_clock() · a3b13c23
      Ingo Molnar authored
      sched_clock() is not a reliable time-source, use cpu_clock() instead.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 17 Jul, 2007 1 commit
    • Freezer: make kernel threads nonfreezable by default · 83144186
      Rafael J. Wysocki authored
      Currently, the freezer treats all tasks as freezable, except for the kernel
      threads that explicitly set the PF_NOFREEZE flag for themselves.  This
      approach is problematic, since it requires every kernel thread to either
      set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
      care for the freezing of tasks at all.
      
      It seems better to only require the kernel threads that want to or need to
      be frozen to use some freezer-related code and to remove any
      freezer-related code from the other (nonfreezable) kernel threads, which is
      done in this patch.
      
      The patch causes all kernel threads to be nonfreezable by default (ie.  to
      have PF_NOFREEZE set by default) and introduces the set_freezable()
      function that should be called by the freezable kernel threads in order to
      unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
      threads call set_freezable(), so it shouldn't cause any (intentional)
      change of behaviour to appear.  Additionally, it updates documentation to
      describe the freezing of tasks more accurately.
      
      [akpm@linux-foundation.org: build fixes]
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
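      The opt-in model described above can be sketched with a single flag
      bit. The struct, the bit value and the helper names are hypothetical;
      only the idea that kernel threads start with PF_NOFREEZE set and that
      set_freezable() clears it comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bit; the numeric value is an assumption. */
#define PF_NOFREEZE_SKETCH 0x8000u

/* Hypothetical, minimal stand-in for a task descriptor. */
struct task_sketch {
    unsigned int flags;
};

/* New kernel threads start out nonfreezable. */
static void kthread_init_sketch(struct task_sketch *t)
{
    assert(t != NULL);
    t->flags |= PF_NOFREEZE_SKETCH;
}

/* A thread that wants to take part in freezing opts in explicitly. */
static void set_freezable_sketch(struct task_sketch *t)
{
    assert(t != NULL);
    t->flags &= ~PF_NOFREEZE_SKETCH;
}

static bool is_freezable_sketch(const struct task_sketch *t)
{
    assert(t != NULL);
    return !(t->flags & PF_NOFREEZE_SKETCH);
}
```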
  24. 09 May, 2007 1 commit
    • Add suspend-related notifications for CPU hotplug · 8bb78442
      Rafael J. Wysocki authored
      Since nonboot CPUs are now disabled after tasks and devices have been
      frozen and the CPU hotplug infrastructure is used for this purpose, we need
      special CPU hotplug notifications that will help the CPU-hotplug-aware
      subsystems distinguish normal CPU hotplug events from CPU hotplug events
      related to a system-wide suspend or resume operation in progress.  This
      patch introduces such notifications and causes them to be used during
      suspend and resume transitions.  It also changes all of the
      CPU-hotplug-aware subsystems to take these notifications into consideration
      (for now they are handled in the same way as the corresponding "normal"
      ones).
      
      [oleg@tv-sign.ru: cleanups]
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 08 May, 2007 3 commits
    • add touch_all_softlockup_watchdogs() · 04c9167f
      Jeremy Fitzhardinge authored
      Add touch_all_softlockup_watchdogs() to allow the softlockup watchdog
      timers on all cpus to be updated.  This is used to prevent sysrq-t from
      generating a spurious watchdog message when generating lots of output.
      
      Softlockup watchdogs use sched_clock() as their timebase, which is inherently
      per-cpu (at least, when it is measuring unstolen time).  Because of this,
      it isn't possible for one CPU to directly update the other CPUs' timers,
      but it is possible to tell the other CPUs to update themselves
      appropriately.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Acked-by: Chris Lalancette <clalance@redhat.com>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Rick Lindsley <ricklind@us.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Ignore stolen time in the softlockup watchdog · 966812dc
      Jeremy Fitzhardinge authored
      The softlockup watchdog is currently a nuisance in a virtual machine, since
      the whole system could have the CPU stolen from it for a long period of
      time.  While it would be unlikely for a guest domain to be denied timer
      interrupts for over 10s, it could happen and any softlockup message would
      be completely spurious.
      
      Earlier I proposed that sched_clock() return time in unstolen nanoseconds,
      which is how Xen and VMI currently implement it.  If the softlockup
      watchdog uses sched_clock() to measure time, it would automatically ignore
      stolen time, and therefore only report when the guest itself locked up.
      When running native, sched_clock() returns real-time nanoseconds, so the
      behaviour would be unchanged.
      
      Note that sched_clock() used this way is inherently per-cpu, so this patch
      makes sure that the per-processor watchdog thread initializes its own
      timestamp.
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: john stultz <johnstul@us.ibm.com>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Dan Hecht <dhecht@vmware.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Chris Lalancette <clalance@redhat.com>
      Cc: Rick Lindsley <ricklind@us.ibm.com>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • softlockup: s/99/MAX_RT_PRIO/ · 02fb6149
      Oleg Nesterov authored
      Don't use hardcoded 99 value, use MAX_RT_PRIO.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  26. 29 Sep, 2006 1 commit
  27. 31 Jul, 2006 1 commit
  28. 27 Jun, 2006 2 commits
  29. 25 Jun, 2006 2 commits