1. 25 Dec, 2008 15 commits
  2. 24 Dec, 2008 1 commit
    • x86: disable X86_PTRACE_BTS · 40f15ad8
      Ingo Molnar authored
      there's a new ptrace arch level feature in .28:
        config X86_PTRACE_BTS
        bool "Branch Trace Store"
      it has broken fork() handling: the old DS area gets copied over into
      a new task without clearing it.
      Fixes exist but they came too late:
        c5dee617: x86, bts: memory accounting
      : x86, bts: add fork and exit handling
      and are queued up for v2.6.29. This shows that the facility is still not
      tested well enough to release into a stable kernel - disable it for now and
      reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
      facilities too - hopefully resulting in better code.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
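The fork() problem described above can be illustrated with a minimal user-space sketch. The `struct task`, its `ds_area` field and `fork_task()` are hypothetical stand-ins chosen for illustration, not the kernel's actual data structures or the queued fix; the sketch only shows why a per-task buffer pointer has to be cleared after a struct copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct task {
    int pid;
    void *ds_area;   /* per-task debug-store buffer, owned by exactly one task */
};

/* Copy the parent into the child, then drop the pointer the child must not share. */
static void fork_task(const struct task *parent, struct task *child, int child_pid)
{
    memcpy(child, parent, sizeof(*child)); /* the copy duplicates ds_area as well */
    child->pid = child_pid;
    child->ds_area = NULL;                 /* the clearing step the message says was missing */
    assert(child->ds_area == NULL);
}
```

Without the clearing step, parent and child would both believe they own the same buffer.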
  3. 23 Dec, 2008 1 commit
  4. 22 Dec, 2008 2 commits
  5. 20 Dec, 2008 1 commit
    • x86: fix resume (S2R) broken by Intel microcode module, on A110L · 280a9ca5
      Dmitry Adamushko authored
      Impact: fix deadlock
      This is in response to the following bug report:
      Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12100
      Subject         : resume (S2R) broken by Intel microcode module, on A110L
      Submitter       : Andreas Mohr <andi@lisas.de>
      Date            : 2008-11-25 08:48 (19 days old)
      Handled-By      : Dmitry Adamushko <dmitry.adamushko@gmail.com>
      [ The deadlock scenario has been discovered by Andreas Mohr ]
      I think I might have a logical explanation why the system
      might hang upon resuming; OTOH, it should likely have hung each and every time.
      (1) possible deadlock in microcode_resume_cpu() if either 'if' section is taken;
      (2) now, I don't see it in spec. and can't experimentally verify it (newer
      ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
      think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
      be back to its original one (we need to reload ucode anyway so it doesn't seem
      logical if a cpu doesn't drop the version)... if so, the comparison with
      memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
      the aforementioned 'if' sections might have been triggered - leading to a deadlock.
      Obviously, in my tests I simulated loading/resuming with the ucode of the same
      version (just to see that the file is loaded/re-loaded upon resuming) so this
      issue has never popped up.
      I'd appreciate if someone with an appropriate system might give a try to the
      2nd patch (titled "fix a comparison && deadlock...").
      In any case, the deadlock situation is a must-have fix.
      Reported-by: Andreas Mohr <andi@lisas.de>
      Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Tested-by: Andreas Mohr <andi@lisas.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
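The concern in point (2) can be sketched as follows. The struct below is a simplified stand-in with the three fields the discussion implies (signature, platform flags, revision); the field layout and both helper names are assumptions made for illustration, and the identity check shown is one possible alternative, not necessarily what the referenced patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in; layout assumed for illustration only. */
struct cpu_signature {
    unsigned int sig;  /* processor signature */
    unsigned int pf;   /* platform flags */
    unsigned int rev;  /* microcode revision, which may differ after resume */
};

/* Whole-struct compare: a changed revision alone is reported as a mismatch. */
static bool same_cpu_strict(const struct cpu_signature *a,
                            const struct cpu_signature *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Identity compare: ignores rev, so a reset revision still matches. */
static bool same_cpu(const struct cpu_signature *a,
                     const struct cpu_signature *b)
{
    assert(a != NULL && b != NULL);
    return a->sig == b->sig && a->pf == b->pf;
}
```

If the revision really is reset across suspend, only the second helper would treat the resumed CPU as the same one.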
  6. 18 Dec, 2008 3 commits
  7. 17 Dec, 2008 5 commits
    • powerpc: Fix corruption error in rh_alloc_fixed() · af4d3643
      Guillaume Knispel authored
      There is an error in rh_alloc_fixed() of the Remote Heap code:
      If there is at least one free block, blk won't be NULL at the end of the
      search loop, so -ENOMEM won't be returned and the else branch of
      "if (bs == s || be == e)" will be taken, corrupting the management
      structures.
      Signed-off-by: Guillaume Knispel <gknispel@proformatique.com>
      Acked-by: Timur Tabi <timur@freescale.com>
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
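The search-loop pitfall described above is a common one and can be shown in isolation. The list type and both functions below are hypothetical and much simpler than the Remote Heap code; the sketch only demonstrates how a loop cursor that doubles as the result stays non-NULL when nothing matched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block list; not the rheap data structures. */
struct block {
    unsigned long start, size;
    struct block *next;
};

/* Buggy shape: blk still points at the last visited block on a miss. */
static struct block *find_buggy(struct block *head, unsigned long s, unsigned long e)
{
    struct block *blk = NULL;
    assert(s < e);
    for (struct block *p = head; p != NULL; p = p->next) {
        blk = p;
        if (blk->start <= s && blk->start + blk->size >= e)
            break;
    }
    return blk;
}

/* Fixed shape: blk is only set when a block actually contains [s, e). */
static struct block *find_fixed(struct block *head, unsigned long s, unsigned long e)
{
    struct block *blk = NULL;
    assert(s < e);
    for (struct block *p = head; p != NULL; p = p->next) {
        if (p->start <= s && p->start + p->size >= e) {
            blk = p;
            break;
        }
    }
    return blk;
}
```

A caller that tests the result against NULL to decide on -ENOMEM only behaves correctly with the second shape.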
    • powerpc/fsl-booke: Fix the miss interrupt restore · 28707af0
      Dave Liu authored
      Commit e5e774d8
      "powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted"
      introduced an issue that causes a problem like this:
      Kernel BUG at c00b19fc [verbose debug info unavailable]
      Oops: Exception in kernel mode, sig: 5 [#1]
      MPC8572 DS
      Modules linked in:
      NIP: c00b19fc LR: c00b1c34 CTR: c0064e88
      REGS: ef02b7b0 TRAP: 0700   Not tainted  (2.6.28-rc8-00057-g1bda7128
      MSR: 00021000 <ME>  CR: 44048028  XER: 20000000
      TASK = ef02c000[1] 'init' THREAD: ef02a000
      GPR00: 00000001 ef02b860 ef02c000 eec201a0 c0dec2c0 00000000 000078a1 00000400
      GPR08: c00b4e40 000078a1 c048ec00 a1780000 44048028 ecd26917 00000001 ef02b948
      GPR16: ffffffea 0000020c 00000000 00000000 00000003 0000000a 00000000 000078a1
      GPR24: eec201a0 00000000 ed849000 00000400 ef02b95c 00000001 ef02b978 ef02b984
      NIP [c00b19fc] __find_get_block+0x24/0x238
      LR [c00b1c34] __getblk+0x24/0x2a0
      Call Trace:
      [ef02b860] [c017b768] generic_make_request+0x290/0x328 (unreliable)
      [ef02b8b0] [c00b1c34] __getblk+0x24/0x2a0
      [ef02b910] [c00b4ae4] __bread+0x14/0xf8
      [ef02b920] [c00fc228] ext2_get_branch+0xf0/0x138
      [ef02b940] [c00fcc88] ext2_get_block+0xb8/0x828
      [ef02ba00] [c00bbdc8] do_mpage_readpage+0x188/0x808
      [ef02bac0] [c00bc5b4] mpage_readpages+0xec/0x144
      [ef02bb50] [c00fba38] ext2_readpages+0x24/0x34
      [ef02bb60] [c006ade0] __do_page_cache_readahead+0x150/0x230
      [ef02bbb0] [c0064bdc] filemap_fault+0x31c/0x3e0
      [ef02bbf0] [c00728b8] __do_fault+0x60/0x5b0
      [ef02bc50] [c0011e0c] do_page_fault+0x2d8/0x4c4
      [ef02bd10] [c000ed90] handle_page_fault+0xc/0x80
      [ef02bdd0] [c00c7adc] set_brk+0x74/0x9c
      [ef02bdf0] [c00c9274] load_elf_binary+0x70c/0x1180
      [ef02be70] [c00945f0] search_binary_handler+0xa8/0x274
      [ef02bea0] [c0095818] do_execve+0x19c/0x1d4
      [ef02bed0] [c000766c] sys_execve+0x58/0x84
      [ef02bef0] [c000e950] ret_from_syscall+0x0/0x3c
      [ef02bfb0] [c009c6fc] sys_dup+0x24/0x6c
      [ef02bfc0] [c0001e04] init_post+0xb0/0xf0
      [ef02bfd0] [c046c1ac] kernel_init+0xcc/0xf4
      [ef02bff0] [c000e6d0] kernel_thread+0x4c/0x68
      Instruction dump:
      4bffffa4 813f000c 4bffffac 9421ffb0 7c0802a6 7d800026 90010054 bf210034
      91810030 7c0000a6 68008000 54008ffe <0f000000> 3d20c04e 3b29ffb8 38000008
      The issue was that the beqlr returns early without re-enabling interrupts.
      Signed-off-by: Dave Liu <daveliu@freescale.com>
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
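The early-return problem can be modelled in C, although the real code is PowerPC assembly. Everything below is a simulation: `irqs_enabled` stands in for the interrupt-enable bit, and `irq_save()`, `irq_restore()` and `invalidate()` are hypothetical names. The point is only that every exit path must pass through the restore.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;   /* simulated interrupt-enable state */

/* Disable interrupts and report the previous state. */
static bool irq_save(void)
{
    bool was = irqs_enabled;
    irqs_enabled = false;
    return was;
}

/* Put the interrupt state back to what irq_save() reported. */
static void irq_restore(bool was)
{
    irqs_enabled = was;
}

/* Returns false when there is no entry to invalidate (the "early" case). */
static bool invalidate(bool entry_present)
{
    bool flags = irq_save();
    bool done = false;

    if (entry_present) {
        /* ... the actual invalidation would happen here ... */
        done = true;
    }

    irq_restore(flags);            /* single exit: both cases restore */
    assert(irqs_enabled == flags);
    return done;
}
```

Returning directly from inside the "no entry" case, before `irq_restore()`, would reproduce the bug this commit fixes.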
    • AMD IOMMU: panic if completion wait loop fails · 84df8175
      Joerg Roedel authored
      Impact: prevents data corruption after a failed completion wait loop
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
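A bounded completion wait of the kind the title refers to might look like the sketch below. The loop bound, the callback interface and all names are assumptions for illustration; the driver's real loop and its panic path are not shown here. The idea is that the wait reports failure explicitly, so the caller can stop instead of continuing with unfinished commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WAIT_LOOPS 1000   /* hypothetical upper bound on polling iterations */

/* Test helper: becomes ready once the counter in ctx has run down. */
static bool countdown_ready(void *ctx)
{
    int *remaining = ctx;
    return --*remaining <= 0;
}

/* Poll until ready() is true; return 0 on success, -1 if the bound is hit. */
static int wait_for_completion_flag(bool (*ready)(void *), void *ctx)
{
    assert(ready != NULL);
    for (int i = 0; i < WAIT_LOOPS; i++) {
        if (ready(ctx))
            return 0;
    }
    return -1;   /* the caller must treat this as fatal, not ignore it */
}
```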
    • AMD IOMMU: set cmd buffer pointers to zero manually · cf558d25
      Joerg Roedel authored
      Impact: set cmd buffer head and tail pointers to zero in case nobody else did
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
    • avr32: remove .note.gnu.build-id section when making vmlinux.bin · c1892cb8
      Hans-Christian Egtvedt authored
      This patch will remove the section .note.gnu.build-id added in binutils
      2.18 from the vmlinux.bin binary. Not removing this section results in a
      huge multi-gigabyte binary and a likewise large uImage.
      Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
      Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
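One plausible explanation for the size blow-up, stated here as an assumption rather than something the commit message says, is that a flat binary image spans from the lowest to the highest load address of the sections it contains, so a small note section placed far below the kernel pulls in the whole gap. The sketch below computes that span; the section addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a loadable section. */
struct section {
    unsigned long long addr;  /* load address */
    unsigned long long size;  /* size in bytes */
};

/* Size of a flat image covering every section, gaps included. */
static unsigned long long flat_image_size(const struct section *s, size_t n)
{
    assert(n > 0);
    unsigned long long lo = s[0].addr;
    unsigned long long hi = s[0].addr + s[0].size;

    for (size_t i = 1; i < n; i++) {
        if (s[i].addr < lo)
            lo = s[i].addr;
        if (s[i].addr + s[i].size > hi)
            hi = s[i].addr + s[i].size;
    }
    return hi - lo;
}
```

With a 0x24-byte note at address 0 and a 2 MiB kernel at the made-up address 0x90000000, the span is 0x90200000 bytes (over 2 GiB); without the note it is 0x200000 bytes.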
  8. 16 Dec, 2008 4 commits
  9. 15 Dec, 2008 4 commits
    • powerpc/cell/axon-msi: Fix MSI after kexec · 23e0e8af
      Arnd Bergmann authored
      Commit d015fe99
       'powerpc/cell/axon-msi: Retry on missing interrupt'
      has turned a rare failure to kexec on QS22 into a reproducible
      error, which we have now analysed.
      The problem is that after a kexec, the MSIC hardware still points
      into the middle of the old ring buffer.  We set up the ring buffer
      during reboot, but not the offset into it.  On older kernels, this
      would cause a storm of thousands of spurious interrupts after a
      kexec, which would most of the time get dropped silently.
      With the new code, we time out on each interrupt, waiting for
      it to become valid.  If more interrupts come in that we time
      out on, this goes on indefinitely, which eventually leads to
      a hard crash.
      The solution in this commit is to read the current offset from
      the MSIC when reinitializing it.  This now works correctly, as
      Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
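The ring-buffer mismatch can be sketched with a simulated device. The `struct msic` register, the ring size and the helper names below are hypothetical and do not reflect the axon-msi driver's real register layout; the sketch only shows why the driver's read offset should be taken from the hardware on reinitialization instead of being assumed to be zero.

```c
#include <assert.h>
#include <stdint.h>

#define RING_BYTES 4096u   /* hypothetical ring size, a power of two */

struct msic { uint32_t hw_write_offset; };  /* simulated device register */
struct ring { uint32_t read_offset; };      /* driver-side state */

/* Re-initialize the driver side by adopting the device's current offset. */
static void ring_init(struct ring *r, const struct msic *dev)
{
    r->read_offset = dev->hw_write_offset & (RING_BYTES - 1);
    assert(r->read_offset < RING_BYTES);
}

/* Bytes the driver believes are waiting between its offset and the device's. */
static uint32_t ring_pending(const struct ring *r, const struct msic *dev)
{
    return (dev->hw_write_offset - r->read_offset) & (RING_BYTES - 1);
}
```

A driver that starts at offset zero while the device is still in the middle of the old ring sees a large backlog of entries that were never written for it.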
    • powerpc: Fix bootmem reservation on uninitialized node · a4c74ddd
      Dave Hansen authored
      careful_allocation() was calling into the bootmem allocator for
      nodes which had not been fully initialized and caused a previous
      bug:  http://patchwork.ozlabs.org/patch/10528/
      So, I merged a few broken-out loops in do_init_bootmem() to fix it.
      That changed the code ordering.
      I think this bug is triggered by having reserved areas for a node
      which are spanned by another node's contents.  In the
      mark_reserved_regions_for_nid() code, we attempt to reserve the
      area for a node before we have allocated the NODE_DATA() for that
      nid.  We do this since I reordered that loop.  I suck.
      This is causing crashes at bootup on some systems, as reported
      by Jon Tollefson.
      This may only present on some systems that have 16GB pages
      reserved.  But, it can probably happen on any system that is
      trying to reserve large swaths of memory that happen to span other
      nodes' contents.
      This commit ensures that we do not touch bootmem for any node which
      has not been initialized, and also removes a compile warning about
      an unused variable.
      Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
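The guard this commit describes, not touching a node's allocator state before that node is set up, can be illustrated with a small sketch. The table, the struct and `reserve_on_node()` are hypothetical and far simpler than NODE_DATA() and the bootmem API; a NULL entry simply stands for "node not initialized yet".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4   /* hypothetical node count */

struct node_data { unsigned long reserved_pages; };

/* NULL until the corresponding node has been initialized. */
static struct node_data *node_table[MAX_NODES];

/* Reserve pages on a node; refuse (return -1) if the node is not set up. */
static int reserve_on_node(int nid, unsigned long pages)
{
    assert(nid >= 0 && nid < MAX_NODES);
    if (node_table[nid] == NULL)
        return -1;                       /* skip uninitialized nodes */
    node_table[nid]->reserved_pages += pages;
    return 0;
}
```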
    • powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area · 48f797de
      Brian King authored
      It looks like most of the hugetlb code is doing the correct thing if
      hugepages are not supported, but the mmap code is not.  If we get into
      the mmap code when hugepages are not supported, such as in an LPAR
      which is running Active Memory Sharing, we can oops the kernel.  This
      fixes the oops being seen in this path.
      oops: Kernel access of bad area, sig: 11 [#1]
      SMP NR_CPUS=1024 NUMA pSeries
      Modules linked in: nfs(N) lockd(N) nfs_acl(N) sunrpc(N) ipv6(N) fuse(N) loop(N)
      dm_mod(N) sg(N) ibmveth(N) sd_mod(N) crc_t10dif(N) ibmvscsic(N)
      scsi_transport_srp(N) scsi_tgt(N) scsi_mod(N)
      Supported: No
      NIP: c000000000038d60 LR: c00000000003945c CTR: c0000000000393f0
      REGS: c000000077e7b830 TRAP: 0300   Tainted: G
      MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 44000448  XER: 20000001
      DAR: c000002000af90a8, DSISR: 0000000040000000
      TASK = c00000007c1b8600[4019] 'hugemmap01' THREAD: c000000077e78000 CPU: 6
      GPR00: 0000001fffffffe0 c000000077e7bab0 c0000000009a4e78 0000000000000000
      GPR04: 0000000000010000 0000000000000001 00000000ffffffff 0000000000000001
      GPR08: 0000000000000000 c000000000af90c8 0000000000000001 0000000000000000
      GPR12: 000000000000003f c000000000a73880 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000010000
      GPR20: 0000000000000000 0000000000000003 0000000000010000 0000000000000001
      GPR24: 0000000000000003 0000000000000000 0000000000000001 ffffffffffffffb5
      GPR28: c000000077ca2e80 0000000000000000 c00000000092af78 0000000000010000
      NIP [c000000000038d60] .slice_get_unmapped_area+0x6c/0x4e0
      LR [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
      Call Trace:
      [c000000077e7bbc0] [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
      [c000000077e7bc30] [c000000000107e30] .get_unmapped_area+0x64/0xd8
      [c000000077e7bcb0] [c00000000010b140] .do_mmap_pgoff+0x140/0x420
      [c000000077e7bd80] [c00000000000bf5c] .sys_mmap+0xc4/0x140
      [c000000077e7be30] [c0000000000086b4] syscall_exit+0x0/0x40
      Instruction dump:
      fac1ffb0 fae1ffb8 fb01ffc0 fb21ffc8 fb41ffd0 fb61ffd8 fb81ffe0 fbc1fff0
      fbe1fff8 f821fef1 f8c10158 f8e10160 <7d49002e> f9010168 e92d01b0 eb4902b0
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
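The kind of validity check the title describes can be sketched as below. The shift table, its contents and the function name are assumptions for illustration and do not mirror the kernel's page-size tables; a zero entry stands for "this page size is not supported on this system", and the function rejects it before any further work.

```c
#include <assert.h>
#include <errno.h>

#define PSIZE_COUNT 4   /* hypothetical number of page-size slots */

/* Hypothetical table: shift per page-size index, 0 means unsupported. */
static const unsigned int psize_shift[PSIZE_COUNT] = { 12, 16, 0, 0 };

/* Return the number of pages covering len, or -EINVAL for a bad request. */
static long get_unmapped_area_checked(int psize, unsigned long len)
{
    assert(psize >= 0 && psize < PSIZE_COUNT);

    unsigned int shift = psize_shift[psize];
    if (shift == 0)
        return -EINVAL;                  /* unsupported size: bail out early */

    unsigned long mask = (1UL << shift) - 1;
    if (len & mask)
        return -EINVAL;                  /* length not a multiple of the page size */

    return (long)(len >> shift);
}
```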
    • [ARM] Ensure linux/hardirqs.h is included where required · 67306da6
      Russell King authored
      ... for the removal of it from asm-generic/local.h
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  10. 14 Dec, 2008 3 commits
  11. 13 Dec, 2008 1 commit
    • powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted · e5e774d8
      Kumar Gala authored
      An example calling sequence which we did see:
      copy_user_highpage -> kmap_atomic -> flush_tlb_page -> _tlbil_va
      We got interrupted after setting up the MAS registers before the
      tlbwe and the interrupt handler that caused the interrupt also did
      a kmap_atomic (ide code) and thus on returning from the interrupt
      the MAS registers no longer contained the proper values.
      Since we don't save/restore MAS registers for normal interrupts we
      need to disable interrupts in _tlbil_va to ensure atomicity.
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
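The race described above can be reproduced in a small simulation. Nothing here is real hardware access: `mas_reg` stands in for the MAS registers, `maybe_interrupt()` models an interrupt handler that performs its own flush in the window between setup and the write, and `tlb_invalidate()` is a hypothetical name. The sketch shows that the committed value is only reliable when interrupts are off for the whole sequence.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long mas_reg;        /* simulated shared MAS register */
static unsigned long last_written;   /* value committed by the simulated tlbwe */
static bool irqs_enabled = true;     /* simulated interrupt-enable state */

/* Models a handler that clobbers the shared register, if interrupts are on. */
static void maybe_interrupt(void)
{
    if (irqs_enabled)
        mas_reg = 0xdeadUL;
}

/* Set up the register, then commit it; optionally with interrupts disabled. */
static void tlb_invalidate(unsigned long va, bool protect)
{
    bool was = irqs_enabled;
    if (protect)
        irqs_enabled = false;

    mas_reg = va;            /* set up */
    maybe_interrupt();       /* window in which an interrupt could arrive */
    last_written = mas_reg;  /* commit */

    irqs_enabled = was;      /* restore on the way out */
    assert(irqs_enabled == was);
}
```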