1. 14 Apr, 2015 3 commits
    • watchdog: enable the new user interface of the watchdog mechanism · 195daf66
      Ulrich Obergfell authored
      With the current user interface of the watchdog mechanism it is only
      possible to disable or enable both lockup detectors at the same time.
      This series introduces new kernel parameters and changes the semantics of
      some existing kernel parameters, so that the hard lockup detector and the
      soft lockup detector can be disabled or enabled individually.  With this
      series applied, the user interface is as follows.
      
      - parameters in /proc/sys/kernel
      
        . soft_watchdog
          This is a new parameter to control and examine the run state of
          the soft lockup detector.
      
        . nmi_watchdog
          The semantics of this parameter have changed. It can now be used
          to control and examine the run state of the hard lockup detector.
      
        . watchdog
          This parameter is still available to control the run state of both
          lockup detectors at the same time. If this parameter is examined,
          it shows the logical OR of soft_watchdog and nmi_watchdog.
      
        . watchdog_thresh
          The semantics of this parameter are not affected by the patch.
      
      - kernel command line parameters
      
        . nosoftlockup
          The semantics of this parameter have changed. It can now be used
          to disable the soft lockup detector at boot time.
      
        . nmi_watchdog=0 or nmi_watchdog=1
          Disable or enable the hard lockup detector at boot time. The patch
          introduces '=1' as a new option.
      
        . nowatchdog
          The semantics of this parameter are not affected by the patch. It
          is still available to disable both lockup detectors at boot time.
      
      Also, remove the proc_dowatchdog() function which is no longer needed.
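
As a rough illustration of the boot-time parameters listed above, the following sketch applies them to a two-bit detector state. It is not the kernel's parser: the names `SKETCH_HARD_ENABLED`, `SKETCH_SOFT_ENABLED` and `sketch_apply_cmdline()` are hypothetical, and the left-to-right, space-separated token handling is an assumption made for the example.

```c
#include <string.h>

/* Hypothetical bit names for this sketch; not taken from the patch. */
#define SKETCH_HARD_ENABLED (1u << 0)
#define SKETCH_SOFT_ENABLED (1u << 1)

/*
 * Apply the boot parameters described in the changelog to a starting
 * state and return the resulting state. Tokens are separated by spaces
 * and applied left to right (an assumption of this sketch). Only the
 * four options named in the changelog are recognised.
 */
unsigned int sketch_apply_cmdline(unsigned int state, const char *cmdline)
{
    char buf[256];
    char *tok;

    /* strtok() modifies its input, so work on a bounded local copy. */
    strncpy(buf, cmdline, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (strcmp(tok, "nowatchdog") == 0)
            state = 0;                       /* both detectors off */
        else if (strcmp(tok, "nosoftlockup") == 0)
            state &= ~SKETCH_SOFT_ENABLED;   /* soft detector off */
        else if (strcmp(tok, "nmi_watchdog=0") == 0)
            state &= ~SKETCH_HARD_ENABLED;   /* hard detector off */
        else if (strcmp(tok, "nmi_watchdog=1") == 0)
            state |= SKETCH_HARD_ENABLED;    /* hard detector on */
    }
    return state;
}
```

Starting from a state with both bits set, `nosoftlockup` leaves only the hard lockup detector enabled, while `nowatchdog` clears both.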
      
      [dzickus@redhat.com: wrote changelog]
      [dzickus@redhat.com: update documentation for kernel params and sysctl]
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watchdog: introduce separate handlers for parameters in /proc/sys/kernel · 83a80a39
      Ulrich Obergfell authored
      Separate handlers for each watchdog parameter in /proc/sys/kernel replace
      the proc_dowatchdog() function.  Three of those handlers merely call
      proc_watchdog_common() with one different argument.
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • watchdog: new definitions and variables, initialization · 84d56e66
      Ulrich Obergfell authored
      The hardlockup and softlockup detectors had always been tied together.
      At the request of the KVM folks, who needed to have one enabled but not
      the other, internally rework the code to split things apart more cleanly.
      
      There is a bunch of churn here, but the end result should be code that
      is easier to maintain and fix without knowing the internals of what
      is going on.
      
      This patch (of 9):
      
      Introduce new definitions and variables to separate the user interface in
      /proc/sys/kernel from the internal run state of the lockup detectors.  The
      internal run state is represented by two bits in a new variable that is
      named 'watchdog_enabled'.  This helps simplify the code, for example:
      
      - In order to check if either of the two lockup detectors is enabled,
        it is sufficient to check if 'watchdog_enabled' is not zero.
      
      - In order to enable/disable one or both lockup detectors,
        it is sufficient to set/clear one or both bits in 'watchdog_enabled'.
      
      - Concurrent updates of 'watchdog_enabled' need not be synchronized via
        a spinlock or a mutex. Updates can either be atomic or concurrency can
        be detected by using 'cmpxchg'.
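
The three points above can be sketched in user-space C with C11 atomics. This is only an illustration under stated assumptions, not the kernel code: the `sketch_` names and bit values are hypothetical, and `atomic_compare_exchange_weak()` stands in for the kernel's 'cmpxchg'.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit assignments for this sketch. */
#define SKETCH_HARD_ENABLED (1u << 0)
#define SKETCH_SOFT_ENABLED (1u << 1)

/* Stand-in for 'watchdog_enabled': one word, two state bits. */
static atomic_uint sketch_watchdog_enabled;

/* Read the current run state of both detectors. */
unsigned int sketch_read(void)
{
    return atomic_load(&sketch_watchdog_enabled);
}

/* True if at least one detector is enabled: a single non-zero test. */
bool sketch_any_enabled(void)
{
    return sketch_read() != 0;
}

/*
 * Set or clear one or both bits without a spinlock or mutex. If another
 * updater changed the word in the meantime, the compare-and-exchange
 * fails, reloads 'old', and the loop retries with the fresh value.
 */
void sketch_update(unsigned int bits, bool enable)
{
    unsigned int old = atomic_load(&sketch_watchdog_enabled);
    unsigned int next;

    do {
        next = enable ? (old | bits) : (old & ~bits);
    } while (!atomic_compare_exchange_weak(&sketch_watchdog_enabled,
                                           &old, next));
}
```

Retrying on a failed exchange is one possible policy; a caller could instead treat the failure as a detected concurrent update and bail out.
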
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 13 Oct, 2014 1 commit
    • kernel/watchdog.c: control hard lockup detection default · 6e7458a6
      Ulrich Obergfell authored
      In some cases we don't want hard lockup detection enabled by default.
      An example is when running as a guest.  Introduce
      
        watchdog_enable_hardlockup_detector(bool)
      
      allowing those cases to disable hard lockup detection.  This must be
      executed early by the boot processor from e.g.  smp_prepare_boot_cpu, in
      order to allow kernel command line arguments to override it, as well as
      to avoid hard lockup detection being enabled before we've had a chance
      to indicate that it's unwanted.  In summary,
      
        initial boot:					default=enabled
        smp_prepare_boot_cpu
          watchdog_enable_hardlockup_detector(false):	default=disabled
        cmdline has 'nmi_watchdog=1':			default=enabled
      
      The running kernel still has the ability to enable/disable at any time
      with /proc/sys/kernel/nmi_watchdog as usual.  However, even when the
      default has been overridden, /proc/sys/kernel/nmi_watchdog will initially
      show '1'.  To truly turn it on one must disable/enable it, i.e.
      
        echo 0 > /proc/sys/kernel/nmi_watchdog
        echo 1 > /proc/sys/kernel/nmi_watchdog
      
      This patch will be immediately useful for KVM with the next patch of this
      series.  Other hypervisor guest types may find it useful as well.
      
      [akpm@linux-foundation.org: fix build]
      [dzickus@redhat.com: fix compile issues on sparc]
      Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 22 Jul, 2014 1 commit
    • acpi, apei, ghes: Make NMI error notification to be GHES architecture extension. · 44a69f61
      Tomasz Nowicki authored
      Currently APEI depends on the x86 architecture, because the NMI hardware
      error notification of GHES is supported only on x86.
      However, many other APEI features can still be used perfectly well by
      other architectures.
      
      This commit adds two symbols:
      1. HAVE_ACPI_APEI for those archs which support APEI.
      2. HAVE_ACPI_APEI_NMI which is used for NMI code isolation in ghes.c
         file. NMI related data and functions are grouped so they can be wrapped
         inside one #ifdef section. Appropriate function stubs are provided for
         !NMI case.
      
      Note that there are no functional changes for x86, because the
      HAVE_ACPI_APEI and HAVE_ACPI_APEI_NMI symbols are hard-selected there.
      Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
      Acked-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
  4. 23 Jun, 2014 2 commits
    • kernel/watchdog.c: print traces for all cpus on lockup detection · ed235875
      Aaron Tomlin authored
      A 'softlockup' is defined as a bug that causes the kernel to loop in
      kernel mode for more than a predefined period of time, without giving
      other tasks a chance to run.
      
      Currently, upon detection of this condition by the per-cpu watchdog
      task, debug information (including a stack trace) is sent to the system
      log.
      
      On some occasions, we have observed that the "victim" rather than the
      actual "culprit" (i.e.  the owner/holder of the contended resource) is
      reported to the user.  Often this information has proven to be
      insufficient to assist debugging efforts.
      
      To avoid loss of useful debug information, for architectures which
      support NMI, this patch makes it possible to improve soft lockup
      reporting.  This is accomplished by issuing an NMI to each cpu to obtain
      a stack trace.
      
      If NMI is not supported we just fall back to the old method.  A sysctl
      and a boot-time parameter are available to toggle this feature.
      
      [dzickus@redhat.com: add CONFIG_SMP in certain areas]
      [akpm@linux-foundation.org: additional CONFIG_SMP=n optimisations]
      [mq@suse.cz: fix warning]
      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jan Moskyto Matejka <mq@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nmi: provide the option to issue an NMI back trace to every cpu but current · f3aca3d0
      Aaron Tomlin authored
      Sometimes it is preferable not to use the trigger_all_cpu_backtrace()
      routine, because one wants to avoid capturing a back trace for current,
      for instance if one was captured recently.
      
      This patch provides a new routine, namely
      trigger_allbutself_cpu_backtrace(), which offers the flexibility to issue
      an NMI to every cpu but current and capture a back trace accordingly.
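
The difference between the two routines comes down to which cpus are targeted. A minimal sketch, assuming at most 64 cpus represented as a bitmask; `sketch_backtrace_targets()` is a hypothetical helper and not part of the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the cpus that should receive a back-trace NMI. With include_self
 * set this models the "all cpus" request; without it, the bit for the
 * current cpu is cleared, modelling the "all but self" request.
 */
uint64_t sketch_backtrace_targets(uint64_t online_mask,
                                  unsigned int current_cpu,
                                  bool include_self)
{
    uint64_t targets = online_mask;

    if (!include_self && current_cpu < 64)
        targets &= ~(UINT64_C(1) << current_cpu);

    return targets;
}
```

On a single-processor system the "all but self" mask comes out empty, which corresponds to the case where there is nothing to send and no message to print.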
      
      Patch x86 and sparc to support the new routine.
      
      [dzickus@redhat.com: add stub in #else clause]
      [dzickus@redhat.com: don't print message in single processor case, wrap with get/put_cpu based on Oleg's suggestion]
      [sfr@canb.auug.org.au: undo C99ism]
      Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Mateusz Guzik <mguzik@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 20 Jun, 2013 1 commit
    • watchdog: Rename confusing state variable · 3c00ea82
      Frederic Weisbecker authored
      We have two very conflicting state variable names in the
      watchdog:
      
      * watchdog_enabled: This one reflects the user interface. It's
      set to 1 by default and can be overridden with boot options
      or the sysctl/procfs interface.
      
      * watchdog_disabled: This is the internal toggle state that
      tells if watchdog threads, timers and NMI events are currently
      running or not. This state mostly depends on the user settings.
      It's a convenient state latch.
      
      Now we really need to find clearer names because those
      are just too confusing to encourage deep review.
      
      watchdog_enabled now becomes watchdog_user_enabled to reflect
      its purpose as an interface.
      
      watchdog_disabled becomes watchdog_running to suggest its
      role as a pure internal state.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Anish Singh <anish198519851985@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
  6. 23 Mar, 2012 1 commit
  7. 23 May, 2011 2 commits
  8. 22 Dec, 2010 1 commit
    • x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR · 4a7863cc
      Don Zickus authored
      The x86 arch has shifted its use of the nmi_watchdog from a
      local implementation to the global one provided by
      kernel/watchdog.c.  This shift has caused a whole bunch of
      compile problems under different config options.  I attempt to
      simplify things with the patch below.
      
      In order to simplify things, I had to come to terms with the
      meaning of two terms ARCH_HAS_NMI_WATCHDOG and
      CONFIG_HARDLOCKUP_DETECTOR.  Basically they mean the same thing,
      the former on a local level and the latter on a global level.
      
      With the old x86 nmi watchdog gone, there is no need to rely on
      defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
      make sense any more.  x86 will now use the global
      implementation.
      
      The changes below do a few things.  First it changes the few
      places that relied on ARCH_HAS_NMI_WATCHDOG to use
      CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
      anyway, so nothing unusual here).  Those pieces of code were
      relying more on local apic functionality than on the nmi watchdog
      functionality, so the change should make sense.
      
      Second, I removed the x86 implementation of
      touch_nmi_watchdog().  It isn't needed now; instead x86 will rely
      on kernel/watchdog.c's implementation.
      
      Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
      x86, and tweaked the include/linux/nmi.h file to tell users to
      look for an externally defined touch_nmi_watchdog in the case of
      ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This
      change removes some of the ugliness in that file.
      
      Finally, I added a Kconfig dependency for
      CONFIG_HARDLOCKUP_DETECTOR that said you can't have
      ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR.  You can
      only have one nmi_watchdog.
      
      Tested with
      ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken configs)
      ARCH=x86_64: allnoconfig, defconfig, allyesconfig, (various broken configs)
      
      Hopefully, after this patch I won't get any more compile broken
      emails. :-)
      
      v3:
        changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
        prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 09 Dec, 2010 1 commit
  10. 18 Nov, 2010 2 commits
    • x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog · 072b198a
      Don Zickus authored
      Now that the bulk of the old nmi_watchdog is gone, remove all
      the stub variables and hooks associated with it.
      
      This touches lots of files mainly because of how the io_apic
      nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
      is forever gone, remove all its fingers.
      
      Most of this code was not being exercised by virtue of
      nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything too
      risky here.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, nmi_watchdog: Remove the old nmi_watchdog · 5f2b0ba4
      Don Zickus authored
      Now that we have a new nmi_watchdog that is more generic and
      sits on top of the perf subsystem, we really do not need the old
      nmi_watchdog any more.
      
      In addition, the old nmi_watchdog doesn't really work if you are
      using the default clocksource, hpet.  The old nmi_watchdog code
      relied on local apic interrupts to determine if the cpu is still
      alive.  With hpet as the clocksource, these interrupts don't
      increment any more and the old nmi_watchdog triggers false
      positives.
      
      This piece removes the old nmi_watchdog code and stubs out any
      variables and function calls.  The stubs are the same ones used
      by the new nmi_watchdog code, so it should be well tested.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 15 May, 2010 1 commit
    • lockup_detector: Cross arch compile fixes · cafcd80d
      Don Zickus authored
      Combining the softlockup and hardlockup code causes watchdog.c
      to build even without the hardlockup detection support.
      
      So if an arch, that has the previous and the new nmi watchdog
      implementations cohabiting, wants to know if the generic one
      is in use, CONFIG_LOCKUP_DETECTOR is not a reliable check.
      We need to use CONFIG_HARDLOCKUP_DETECTOR instead.
      
      Fixes:
      	kernel/built-in.o: In function `touch_nmi_watchdog':
      	(.text+0x449bc): multiple definition of `touch_nmi_watchdog'
      	arch/sparc/kernel/built-in.o:(.text+0x11b28): first defined here
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      LKML-Reference: <20100514151121.GR15159@redhat.com>
      [ use CONFIG_HARDLOCKUP_DETECTOR instead of CONFIG_PERF_EVENTS_NMI]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  12. 12 May, 2010 1 commit
    • lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Don Zickus authored
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      kernel/watchdog.c.
      
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      
      To detect hardlockups, cpus not responding to interrupts, I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
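
The hard lockup check described above can be sketched as a per-cpu counter comparison. This is an illustration under stated assumptions rather than the code from the patch: the struct and function names are hypothetical, the timer callback is modelled as a plain function, and the check is assumed to run once per perf overflow event.

```c
#include <stdbool.h>

/* Hypothetical per-cpu bookkeeping for this sketch. */
struct sketch_cpu_state {
    unsigned long timer_ticks;        /* bumped by the timer callback */
    unsigned long timer_ticks_saved;  /* value seen at the last overflow */
};

/* Models the hrtimer callback: it only runs if interrupts are serviced. */
void sketch_timer_tick(struct sketch_cpu_state *s)
{
    s->timer_ticks++;
}

/*
 * Models the check made on each perf overflow event. If the timer count
 * has not moved since the previous overflow, the cpu has stopped taking
 * timer interrupts and is reported as hard locked up.
 */
bool sketch_overflow_check(struct sketch_cpu_state *s)
{
    if (s->timer_ticks == s->timer_ticks_saved)
        return true;

    s->timer_ticks_saved = s->timer_ticks;
    return false;
}
```

Because the timer is expected to fire several times per overflow period, a single delayed tick does not trigger a report; only a complete stall between two overflow events does.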
      
      To detect softlockups, tasks not yielding to the scheduler, I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled, neither is anyone else, and the
      warning is printed to the console.
      
      I tested this on x86_64 and both the softlockup and hardlockup paths
      work.
      
      V2:
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - separated out the softlockup case from perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      
      V3:
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      
      V4:
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      
      V5:
      - split apart warn flags for hard and soft lockups
      
      TODO:
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      
      [fweisbec: merged conflict patch]
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  13. 25 Feb, 2010 1 commit
    • nmi_watchdog: Clean up various small details · 47195d57
      Don Zickus authored
      Mostly copy/paste whitespace damage with a couple of nitpicks by
      the checkpatch script. Fix the struct definition as requested by Ingo too.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: peterz@infradead.org
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266880143-24943-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      --
       arch/x86/kernel/apic/hw_nmi.c |   14 +++++------
       arch/x86/kernel/traps.c       |    6 ++--
       include/linux/nmi.h           |    2 -
       kernel/nmi_watchdog.c         |   51 ++++++++++++++++++++----------------------
       4 files changed, 36 insertions(+), 37 deletions(-)
  14. 14 Feb, 2010 1 commit
  15. 08 Feb, 2010 1 commit
    • nmi_watchdog: Config option to enable new nmi_watchdog · 84e478c6
      Don Zickus authored
      These are the bits that enable the new nmi_watchdog and safely
      isolate the old nmi_watchdog.  Only one or the other can run,
      not both at the same time.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  16. 03 Aug, 2009 1 commit
    • debug lockups: Improve lockup detection, fix generic arch fallback · 47cab6a7
      Ingo Molnar authored
      As Andrew noted, my previous patch ("debug lockups: Improve lockup
      detection") broke/removed SysRq-L support from architectures that do
      not provide a __trigger_all_cpu_backtrace implementation.
      
      Restore a fallback path and clean up the SysRq-L machinery a bit:
      
       - Rename the arch method to arch_trigger_all_cpu_backtrace()
      
       - Simplify the define
      
       - Document the method a bit - in the hope of more architectures
         adding support for it.
      
      [ The patch touches Sparc code for the rename. ]
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 13 Feb, 2007 1 commit
    • [PATCH] x86: fix laptop bootup hang in init_acpi() · 5d0e600d
      Ingo Molnar authored
      During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about
      10%-20% of the time in acpi_init():
      
       Calling initcall 0xc055ce1a: topology_init+0x0/0x2f()
       Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c()
       Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175()
       Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17()
       Calling initcall 0xc0569f99: init_bio+0x0/0xf4()
       Calling initcall 0xc056b865: genhd_device_init+0x0/0x50()
       Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87()
       Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee()
      
      It's a hard hang that not even an NMI could punch through!  Frustratingly,
      adding printks or function tracing to the ACPI code made the hangs go away
      ...
      
      After some time an additional detail emerged: disabling the NMI watchdog
      made these occasional hangs go away.
      
      So i spent the better part of today trying to debug this and trying out
      various theories when i finally found the likely reason for the hang: if
      acpi_ns_initialize_devices() executes an _INI AML method and an NMI
      happens to hit that AML execution at the wrong moment, the machine would
      hang.  (my theory is that this must be some sort of chipset setup method
      doing stores to chipset mmio registers?)
      
      Unfortunately given the characteristics of the hang it was practically
      impossible to figure out which of the numerous AML methods is impacted
      by this problem.
      
      As a workaround i wrote an interface to disable chipset-based NMIs while
      executing _INI sections - and indeed this fixed the hang.  I did a
      boot-loop of 100 separate reboots and none hung - while without the patch
      it would hang every 5-10 attempts.  Out of caution i did not touch the
      nmi_watchdog=2 case (it's not related to the chipset anyway and didn't
      hang).
      
      I implemented this for both x86_64 and i686, tested the i686 laptop both
      with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and
      tested an Athlon64 box with the 64-bit kernel as well. Everything builds
      and works with the patch applied.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  18. 06 Dec, 2006 1 commit
  19. 29 Sep, 2006 1 commit
  20. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!