1. 16 Feb, 2007 1 commit
  2. 13 Feb, 2007 3 commits
    • [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs · 0a4599c8
      Andi Kleen authored
      
      
      For i386/x86-64.
      
      Straightforward -- just reuse the Family 0xf code.
      
      Signed-off-by: Andi Kleen <ak@suse.de>
    • [PATCH] x86: fix laptop bootup hang in init_acpi() · 5d0e600d
      Ingo Molnar authored
      
      
      During kernel bootup, a new T60 laptop (CoreDuo, 32-bit) hangs about
      10%-20% of the time in acpi_init():
      
       Calling initcall 0xc055ce1a: topology_init+0x0/0x2f()
       Calling initcall 0xc055d75e: mtrr_init_finialize+0x0/0x2c()
       Calling initcall 0xc05664f3: param_sysfs_init+0x0/0x175()
       Calling initcall 0xc014cb65: pm_sysrq_init+0x0/0x17()
       Calling initcall 0xc0569f99: init_bio+0x0/0xf4()
       Calling initcall 0xc056b865: genhd_device_init+0x0/0x50()
       Calling initcall 0xc056c4bd: fbmem_init+0x0/0x87()
       Calling initcall 0xc056dd74: acpi_init+0x0/0x1ee()
      
      It's a hard hang that not even an NMI could punch through!  Frustratingly,
      adding printks or function tracing to the ACPI code made the hangs go away
      ...
      
      After some time an additional detail emerged: disabling the NMI watchdog
      made these occasional hangs go away.
      
      So I spent the better part of today debugging this and trying out
      various theories before I finally found the likely reason for the hang:
      if acpi_ns_initialize_devices() executes an _INI AML method and an NMI
      happens to hit that AML execution at the wrong moment, the machine
      hangs.  (My theory is that this must be some sort of chipset setup
      method doing stores to chipset MMIO registers.)
      
      Unfortunately, given the characteristics of the hang, it was nearly
      impossible to figure out which of the numerous AML methods is affected
      by this problem.
      
      As a workaround I wrote an interface to disable chipset-based NMIs while
      executing _INI sections, and indeed this fixed the hang.  I did a
      boot-loop of 100 separate reboots and none hung, while without the patch
      it would hang every 5-10 attempts.  Out of caution I did not touch the
      nmi_watchdog=2 case (it's not related to the chipset anyway and didn't
      hang).
      
      I implemented this for both x86_64 and i686, tested the i686 laptop both
      with nmi_watchdog=1 [which triggered the hangs] and nmi_watchdog=2, and
      tested an Athlon64 box with the 64-bit kernel as well. Everything builds
      and works with the patch applied.
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
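The workaround described in this commit (mask the chipset NMI source while an _INI method runs, then unmask it) can be illustrated with a minimal user-space sketch. All names here (`chipset_nmi_enabled`, `nmi_source_disable`, `nmi_source_enable`, `run_ini_protected`, `ini_sees_nmi_masked`) are hypothetical stand-ins and are not taken from the patch; the real code operates on hardware state rather than a boolean.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the chipset NMI source. */
bool chipset_nmi_enabled = true;

/* Mirrors the nmi_watchdog= boot parameter: 1 = chipset-based, 2 = local APIC. */
int watchdog_mode = 1;

/* Only the chipset-based mode is touched, as the commit message describes. */
static void nmi_source_disable(void)
{
    if (watchdog_mode == 1)
        chipset_nmi_enabled = false;
}

static void nmi_source_enable(void)
{
    if (watchdog_mode == 1)
        chipset_nmi_enabled = true;
}

typedef bool (*ini_method_fn)(void);

/* Run one _INI-style callback with the chipset NMI source masked. */
bool run_ini_protected(ini_method_fn ini)
{
    bool result;

    nmi_source_disable();
    result = ini();
    nmi_source_enable();
    return result;
}

/* Example callback: reports whether NMIs were masked while it ran. */
bool ini_sees_nmi_masked(void)
{
    return !chipset_nmi_enabled;
}
```

Calling `run_ini_protected(ini_sees_nmi_masked)` with `watchdog_mode == 1` returns true and leaves the NMI source enabled afterwards; with `watchdog_mode == 2` the source is never masked.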
    • [PATCH] i386: Handle 32 bit PerfMon Counter writes cleanly in i386 nmi_watchdog · 90ce4bc4
      Venkatesh Pallipadi authored
      
      
      Change the i386 NMI handler to handle 32-bit perfmon counter MSR writes cleanly.
      
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
  3. 23 Jan, 2007 1 commit
  4. 09 Dec, 2006 1 commit
    • [PATCH] x86: Fix boot hang due to nmi watchdog init code · 92715e28
      Ravikiran G Thirumalai authored
      
      
      2.6.19 stopped booting on our x86_64 systems (or booted only with
      certain builds/configs) due to a bug introduced in 2.6.19.
      check_nmi_watchdog schedules an IPI on all CPUs to busy-wait on a flag,
      but fails to set the busywait flag if NMI functionality is disabled.
      This causes the secondary CPUs to spin in an endless loop, hanging the
      kernel bootup.  Depending upon the build, the busywait flag (a stack
      variable) got overwritten, which let the kernel boot on certain builds.
      The following patch fixes the bug by setting the busywait flag before
      returning from check_nmi_watchdog.  I guess using a stack variable is not
      good here, as the calling function could potentially return while the
      busy-wait loop is still spinning on the flag.
      
      AK: I redid the patch significantly to be cleaner
      
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
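The two points made in this commit message (release the busy-wait flag on every exit path, and keep the flag off the stack) can be sketched in user-space C. This is only an illustration under assumptions: `endflag`, `busy_wait`, and `check_watchdog` are hypothetical names, the spin is bounded so the sketch cannot hang, and no real IPIs or NMIs are involved.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage: the flag outlives any caller, unlike a stack variable. */
static atomic_int endflag;

/*
 * What each secondary CPU would run: spin until released.
 * Bounded by max_spins here so the sketch always terminates.
 * Returns true if the flag was observed set.
 */
bool busy_wait(unsigned long max_spins)
{
    while (max_spins--) {
        if (atomic_load(&endflag))
            return true;
    }
    return false;
}

/* Returns 0 on success, -1 if the (simulated) NMI functionality is disabled. */
int check_watchdog(bool nmi_enabled)
{
    int ret = 0;

    atomic_store(&endflag, 0);
    /* The kernel would start the busy loops on the other CPUs here. */

    if (!nmi_enabled) {
        ret = -1;   /* early exit: the path that forgot to release the spinners */
        goto out;
    }

    /* ... the actual NMI counter checks would go here ... */

out:
    atomic_store(&endflag, 1);  /* release the spinners on every exit path */
    return ret;
}
```

Funnelling all returns through a single `out:` label is one common way to make sure the release cannot be skipped; whether the reworked patch used this exact shape is not stated in the message.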
  5. 06 Dec, 2006 2 commits
  6. 21 Oct, 2006 1 commit
  7. 01 Oct, 2006 1 commit
  8. 29 Sep, 2006 1 commit
  9. 26 Sep, 2006 11 commits
  10. 31 Jul, 2006 1 commit
  11. 03 Jul, 2006 1 commit
  12. 26 Jun, 2006 2 commits
  13. 28 Mar, 2006 2 commits
  14. 23 Mar, 2006 2 commits
    • [PATCH] more for_each_cpu() conversions · 394e3902
      Andrew Morton authored
      
      
      When we stop allocating percpu memory for not-possible CPUs we must not touch
      the percpu data for not-possible CPUs at all.  The correct way of doing this
      is to test cpu_possible() or to use for_each_cpu().
      
      This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
      few instances of this bug, if any.  But the patch converts lots of open-coded
      tests to use the preferred helper macros.
      
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: Kyle McMartin <kyle@parisc-linux.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Christian Zankel <chris@zankel.net>
      Cc: Philippe Elie <phil.el@wanadoo.fr>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
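The rule stated in this commit (never touch per-CPU data for CPUs that are not possible) can be shown with a small user-space sketch. The mask value, `cpu_is_possible`, `for_each_possible_cpu_sketch`, and `sum_counters` are all hypothetical names chosen for illustration; the kernel's own helpers are defined differently.

```c
#define NR_CPUS 8

/* Assumed example mask: CPUs 0, 1 and 3 are possible. */
static const unsigned long possible_mask = 0x0bUL;

static int cpu_is_possible(int cpu)
{
    return (int)((possible_mask >> cpu) & 1UL);
}

/* Iterate only over possible CPUs instead of open-coding 0..NR_CPUS-1. */
#define for_each_possible_cpu_sketch(cpu)            \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++)        \
        if (!cpu_is_possible(cpu)) { } else

/*
 * Per-CPU slots are only allocated for possible CPUs; the rest are null
 * pointers, so a plain loop over NR_CPUS would dereference null.
 */
long sum_counters(long *percpu[NR_CPUS])
{
    long total = 0;
    int cpu;

    for_each_possible_cpu_sketch(cpu)
        total += *percpu[cpu];

    return total;
}
```

With slots for CPUs 0, 1 and 3 holding 1, 2 and 4, `sum_counters` returns 7 and never reads the unallocated slots.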
    • [PATCH] make bug messages more consistent · 91368d73
      Ingo Molnar authored
      
      
      Consolidate all kernel bug printouts to begin with the "BUG: " string.
      Makes it easier to find them in large bootup logs.
      
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  15. 08 Mar, 2006 1 commit
    • [PATCH] x86: Fix i386 nmi_watchdog that does not trigger die_nmi · b884e257
      GOTO Masanori authored
      
      
      Fix the i386 nmi_watchdog, which never meets the watchdog timeout
      condition.  It does not hit die_nmi when it should, because the current
      nmi_watchdog_tick in arch/i386/kernel/nmi.c never counts up alert_counter,
      like this:
      
      	void nmi_watchdog_tick (struct pt_regs * regs) {
      	if (last_irq_sums[cpu] == sum) {
      		alert_counter[cpu]++;		<- count up alert_counter, but
      		if (alert_counter[cpu] == 5*nmi_hz)
      			die_nmi(regs, "NMI Watchdog detected LOCKUP");
      		alert_counter[cpu] = 0;		<- reset alert_counter
      
      This patch changes it back to the previous and working version.
      
      This was found and originally written by Kohta NAKASHIMA.
      
      (akpm: also uninline write_watchdog_counter(), saving 184 bytes)
      
      Signed-off-by: GOTO Masanori <gotom@sanori.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
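The excerpt in this commit message shows alert_counter being reset unconditionally right after it is incremented. A minimal user-space sketch of the intended behaviour (count up while the interrupt sum stays unchanged, reset only when it changes) might look as follows. The `watchdog_state` struct, `watchdog_tick`, and the `NMI_HZ` value are assumptions for illustration; the kernel keeps this state in per-CPU arrays and calls die_nmi() instead of returning a flag.

```c
#include <stdbool.h>

#define NMI_HZ 1            /* assumed NMI rate in ticks per second */
#define LOCKUP_SECONDS 5    /* threshold from the excerpt: 5*nmi_hz */

/* Hypothetical per-CPU watchdog state. */
struct watchdog_state {
    unsigned int last_irq_sum;
    unsigned int alert_counter;
};

/*
 * One watchdog tick. Returns true when the lockup threshold is reached,
 * which is where the kernel would call die_nmi().
 */
bool watchdog_tick(struct watchdog_state *ws, unsigned int irq_sum)
{
    if (ws->last_irq_sum == irq_sum) {
        /* No interrupts since the last tick: count towards a lockup. */
        ws->alert_counter++;
        if (ws->alert_counter == LOCKUP_SECONDS * NMI_HZ)
            return true;
    } else {
        /* Progress was made: remember the new sum and reset the counter. */
        ws->last_irq_sum = irq_sum;
        ws->alert_counter = 0;
    }
    return false;
}
```

The reset sits in the `else` branch, so a stalled CPU accumulates ticks until the threshold is hit, while any change in the interrupt sum starts the count over.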
  16. 05 Feb, 2006 1 commit
  17. 30 Oct, 2005 1 commit
  18. 26 Sep, 2005 1 commit
  19. 07 Sep, 2005 1 commit
  20. 05 Sep, 2005 1 commit
  21. 19 Aug, 2005 1 commit
    • [PATCH] Mobil Pentium 4 HT and the NMI · cd3716ab
      Steven Rostedt authored
      
      
      I'm trying to get the nmi working with my laptop (IBM ThinkPad G41) and after
      debugging it a while, I found that the nmi code doesn't want to set it up for
      this particular CPU.
      
      Here I have:
      
      $ cat /proc/cpuinfo
      processor       : 0
      vendor_id       : GenuineIntel
      cpu family      : 15
      model           : 4
      model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz
      stepping        : 1
      cpu MHz         : 3320.084
      cache size      : 1024 KB
      physical id     : 0
      siblings        : 2
      core id         : 0
      cpu cores       : 1
      fdiv_bug        : no
      hlt_bug         : no
      f00f_bug        : no
      coma_bug        : no
      fpu             : yes
      fpu_exception   : yes
      cpuid level     : 3
      wp              : yes
      flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
      mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
      monitor ds_cpl est tm2 cid xtpr
      bogomips        : 6642.39
      
      processor       : 1
      vendor_id       : GenuineIntel
      cpu family      : 15
      model           : 4
      model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz
      stepping        : 1
      cpu MHz         : 3320.084
      cache size      : 1024 KB
      physical id     : 0
      siblings        : 2
      core id         : 0
      cpu cores       : 1
      fdiv_bug        : no
      hlt_bug         : no
      f00f_bug        : no
      coma_bug        : no
      fpu             : yes
      fpu_exception   : yes
      cpuid level     : 3
      wp              : yes
      flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
      mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
      monitor ds_cpl est tm2 cid xtpr
      bogomips        : 6637.46
      
      And the following code shows:
      
      $ cat linux-2.6.13-rc6/arch/i386/kernel/nmi.c
      
      [...]
      
      void setup_apic_nmi_watchdog (void)
      {
              switch (boot_cpu_data.x86_vendor) {
              case X86_VENDOR_AMD:
                      if (boot_cpu_data.x86 != 6 && boot_cpu_data.x86 != 15)
                              return;
                      setup_k7_watchdog();
                      break;
              case X86_VENDOR_INTEL:
                       switch (boot_cpu_data.x86) {
                      case 6:
                              if (boot_cpu_data.x86_model > 0xd)
                                      return;
      
                              setup_p6_watchdog();
                              break;
                      case 15:
                              if (boot_cpu_data.x86_model > 0x3)
                                      return;
      
      Here I get boot_cpu_data.x86_model == 0x4.  So I decided to change it and
      reboot.  I now seem to have a working NMI.  So, unless there's something
      known to be bad about this processor and the NMI, I'm submitting the
      following patch.
      
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
      Acked-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
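The effect of the model check quoted in this commit can be sketched as a standalone predicate. The function name `intel_watchdog_supported` and the two limit macros are hypothetical, and the new limit of 0x4 is an assumption: the message only says the check was changed so that model 0x4 is accepted, not what the final value is.

```c
#include <stdbool.h>

#define P4_MAX_MODEL_OLD 0x3   /* limit shown in the quoted 2.6.13-rc6 code */
#define P4_MAX_MODEL_NEW 0x4   /* assumed new limit; not stated in the message */

/*
 * Mirrors the Intel branch of the quoted switch: family 6 is accepted up to
 * model 0xd, family 15 up to p4_max_model, everything else is rejected.
 */
bool intel_watchdog_supported(unsigned int family, unsigned int model,
                              unsigned int p4_max_model)
{
    switch (family) {
    case 6:
        return model <= 0xd;
    case 15:
        return model <= p4_max_model;
    default:
        return false;
    }
}
```

With the old limit, the reported CPU (family 15, model 4) is rejected and the watchdog is never set up; with the assumed new limit it is accepted.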
  22. 23 Jun, 2005 1 commit
  23. 01 May, 2005 1 commit
    • [PATCH] check nmi watchdog is broken · 67701ae9
      Jack F Vogel authored
      
      
      A bug against an xSeries system showed up recently noting that the
      check_nmi_watchdog() test was failing.
      
      I have been investigating it and discovered that, in both i386 and x86_64,
      the recent change to the routine to use the cpu_callin_map has uncovered a
      problem.  Prior to that change, on an SMP box, the test was trivially
      passing because all CPUs were found to not yet be online.  Now that the
      callin_map discovers them, the test goes on to check the counters, which
      have not yet begun to increment, so it announces that a CPU is stuck and
      bails out.
      
      On all the systems I have access to for testing, the announcement of
      failure is also bogus: by the time you can log in and check
      /proc/interrupts, the NMI count is happily incrementing on all CPUs.  It's
      just that the test is being done too early.
      
      I have tried moving the call to the test around a bit, and it was always
      too early.  I finally hit on this proposed solution: it delays the routine
      via a late_initcall(), which seems like the right solution to me.
      
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  24. 16 Apr, 2005 1 commit