    • x86_64 irq: use mask/unmask and proper locking in fixup_irqs() · 48d8d7ee
      Siddha, Suresh B authored
      The forced irq migration path during cpu offline does not use proper locks
      or the irq_chip mask/unmask routines.  This results in races (in particular,
      the device generating the interrupt can see inconsistent state,
      leading to issues such as a stuck irq).
      The appended patch fixes the issue by taking the proper lock and wrapping
      irq_chip set_affinity() with a mask() before and an unmask() after.
      This fixes an MSI irq stuck issue reported by Darrick Wong.
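      The ordering described above (take the lock, mask, change affinity, unmask,
      release the lock) can be sketched in userspace as follows. This is only an
      illustration, not the patch itself: struct mock_irq, the mock_* helpers and
      the integer "locked" flag are hypothetical stand-ins for the kernel's irq
      descriptor, irq_chip callbacks and descriptor lock.

```c
/* Hypothetical userspace model; none of these names exist in the kernel. */
struct mock_irq {
    int locked;          /* stands in for the per-irq descriptor lock */
    int masked;          /* 1 while the interrupt source is masked */
    int target_cpu;      /* CPU the interrupt is currently routed to */
    int unsafe_updates;  /* affinity changes made while unmasked or unlocked */
};

static void mock_mask(struct mock_irq *irq)   { irq->masked = 1; }
static void mock_unmask(struct mock_irq *irq) { irq->masked = 0; }

/* Retarget the interrupt; record it if the caller skipped the lock or the mask. */
static void mock_set_affinity(struct mock_irq *irq, int cpu)
{
    if (!irq->locked || !irq->masked)
        irq->unsafe_updates++;
    irq->target_cpu = cpu;
}

/* Move the irq to new_cpu with the ordering the commit message describes. */
static void mock_migrate_irq(struct mock_irq *irq, int new_cpu)
{
    irq->locked = 1;                  /* take the lock */
    mock_mask(irq);                   /* device cannot raise the irq mid-update */
    mock_set_affinity(irq, new_cpu);  /* routing changes while masked */
    mock_unmask(irq);                 /* re-enable delivery on the new CPU */
    irq->locked = 0;                  /* release the lock */
}
```

      In this model, calling mock_set_affinity() directly, without the surrounding
      lock and mask, is what increments unsafe_updates; that corresponds to the
      unprotected path the commit message describes.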
      There are several more general bugs in this area (irq migration in
      process context). For example,
       1. Possibility of missing an edge-triggered irq.
       2. Lack of a reliable method of migrating a level-triggered irq in process context.
      We plan to look into and close these in the near future.
      Eric says:
      	In addition even with the fix from Suresh there is still at least one
      	nasty hardware race in fixup_irqs().   However we exercise that code
      	path rarely enough that we are unlikely to hit it in the real world,
      	and that race seems to have existed since the code was merged.  And a
      	fix for that is not coming soon as it is an open investigation area
      	if we can fix irq migration to work outside of irq context or if
      	we have to rework the requirements imposed by the generic cpu hotplug
      	and layer on fixup_irqs().  So this may come up again.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Reported-and-tested-by: Darrick Wong <djwong@us.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86_64: set the irq_chip name for lapic · c47e285d
      Suresh Siddha authored
      Set the irq_chip name for lapic.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86_64: fix misplaced `continue' in mce.c · 4f84e4be
      Joshua Wise authored
        When a userspace application wants to know about machine check events, it
        opens /dev/mcelog and does a read(). Usually, we found that this interface
        works well, but in some cases, when the system was taking large numbers of
        machine check exceptions, the read() would hang. The system would output a
        soft-lockup warning, and the daemon reading from /dev/mcelog would suck up
        as much of a single CPU as it could spinning in system space.
        This patch fixes this bug. In particular, there was a "continue" inside a
        timeout loop that presumably was intended to break out of the outer loop,
        but instead caused the inner loop to continue. This patch also makes the
        condition for the break-out a little more evident by changing a
        !time_before to a time_after_eq.
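        The control-flow problem described above can be sketched in userspace as
        follows. This is an illustration under assumptions, not the mce.c code:
        struct entry, now(), ticks_after_eq() and drain_entries() are hypothetical
        names, and a fake counter stands in for jiffies. A "continue" at the marked
        spot would only re-test the inner loop condition, so an entry that never
        finishes would spin forever; jumping past the copy step ends the wait.

```c
#include <stddef.h>

/* Hypothetical log entry; "finished" is set once the writer has completed it. */
struct entry {
    int finished;
    int payload;
};

/* Fake clock standing in for jiffies: advances by one tick on every read. */
static unsigned long fake_clock;
static unsigned long now(void) { return fake_clock++; }

/* Wrap-safe "a is at or after b", modelled on the kernel's time_after_eq(). */
static int ticks_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Copy finished entries into out[]; skip any entry that times out. */
static size_t drain_entries(const struct entry *log, size_t n,
                            unsigned long timeout, int *out)
{
    size_t copied = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long start = now();

        while (!log[i].finished) {
            /* A "continue" here would restart this inner loop and hang. */
            if (ticks_after_eq(now(), start + timeout))
                goto next_entry;   /* give up on this entry only */
        }
        out[copied++] = log[i].payload;
next_entry:
        ;
    }
    return copied;
}
```

        With three entries of which the middle one never finishes, this sketch
        returns after the timeout with the first and third payloads copied.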
        The read() no longer hangs in this test case.
        On my system, I could replicate the bug with the following command:
          # for i in `seq 15000`; do ./inject_sbe.sh; done
        where inject_sbe.sh contains commands to inject a single-bit error into the
        next memory write transaction.
        This patch is against git f1518a08
      Signed-off-by: Joshua Wise <jwise@google.com>
      Signed-off-by: Tim Hockin <thockin@google.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
