1. 02 Nov, 2014 3 commits
    • Paolo Bonzini's avatar
      KVM: vmx: defer load of APIC access page address during reset · a73896cb
      Paolo Bonzini authored
      Most call paths to vmx_vcpu_reset do not hold the SRCU lock.  Defer loading
      the APIC access page to the next vmentry.
      
      This avoids the following lockdep splat:
      
      [ INFO: suspicious RCU usage. ]
      3.18.0-rc2-test2+ #70 Not tainted
      -------------------------------
      include/linux/kvm_host.h:474 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 0
      1 lock held by qemu-system-x86/2371:
       #0:  (&vcpu->mutex){+.+...}, at: [<ffffffffa037d800>] vcpu_load+0x20/0xd0 [kvm]
      
      stack backtrace:
      CPU: 4 PID: 2371 Comm: qemu-system-x86 Not tainted 3.18.0-rc2-test2+ #70
      Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A12 01/10/2013
       0000000000000001 ffff880209983ca8 ffffffff816f514f 0000000000000000
       ffff8802099b8990 ffff880209983cd8 ffffffff810bd687 00000000000fee00
       ffff880208a2c000 ffff880208a10000 ffff88020ef50040 ffff880209983d08
      Call Trace:
       [<ffffffff816f514f>] dump_stack+0x4e/0x71
       [<ffffffff810bd687>] lockdep_rcu_suspicious+0xe7/0x120
       [<ffffffffa037d055>] gfn_to_memslot+0xd5/0xe0 [kvm]
       [<ffffffffa03807d3>] __gfn_to_pfn+0x33/0x60 [kvm]
       [<ffffffffa0380885>] gfn_to_page+0x25/0x90 [kvm]
       [<ffffffffa038aeec>] kvm_vcpu_reload_apic_access_page+0x3c/0x80 [kvm]
       [<ffffffffa08f0a9c>] vmx_vcpu_reset+0x20c/0x460 [kvm_intel]
       [<ffffffffa039ab8e>] kvm_vcpu_reset+0x15e/0x1b0 [kvm]
       [<ffffffffa039ac0c>] kvm_arch_vcpu_setup+0x2c/0x50 [kvm]
       [<ffffffffa037f7e0>] kvm_vm_ioctl+0x1d0/0x780 [kvm]
       [<ffffffff810bc664>] ? __lock_is_held+0x54/0x80
       [<ffffffff812231f0>] do_vfs_ioctl+0x300/0x520
       [<ffffffff8122ee45>] ? __fget+0x5/0x250
       [<ffffffff8122f0fa>] ? __fget_light+0x2a/0xe0
       [<ffffffff81223491>] SyS_ioctl+0x81/0xa0
       [<ffffffff816fed6d>] system_call_fastpath+0x16/0x1b
      Reported-by: default avatarTakashi Iwai <tiwai@suse.de>
      Reported-by: default avatarAlexei Starovoitov <alexei.starovoitov@gmail.com>
      Reviewed-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Tested-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Fixes: 38b99173Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      a73896cb
    • Jan Kiszka's avatar
      KVM: nVMX: Disable preemption while reading from shadow VMCS · 282da870
      Jan Kiszka authored
      In order to access the shadow VMCS, we need to load it. At this point,
      vmx->loaded_vmcs->vmcs and the actually loaded one start to differ. If
      we now get preempted by Linux, vmx_vcpu_put and, on return, the
      vmx_vcpu_load will work against the wrong vmcs. That can cause
      copy_shadow_to_vmcs12 to corrupt the vmcs12 state.
      
      Fix the issue by disabling preemption during the copy operation.
      copy_vmcs12_to_shadow is safe from this issue as it is executed by
      vmx_vcpu_run when preemption is already disabled before vmentry.
      
      This bug is exposed by running Jailhouse within KVM on CPUs with
      shadow VMCS support.  Jailhouse never expects an interrupt pending
      vmexit, but the bug can cause it if, after copy_shadow_to_vmcs12
      is preempted, the active VMCS happens to have the virtual interrupt
      pending flag set in the CPU-based execution controls.
      Signed-off-by: default avatarJan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      282da870
    • Nadav Amit's avatar
      KVM: x86: Fix far-jump to non-canonical check · 7e46dddd
      Nadav Amit authored
      Commit d1442d85 ("KVM: x86: Handle errors when RIP is set during far
      jumps") introduced a bug that caused the fix to be incomplete.  Due to
      incorrect evaluation, far jump to segment with L bit cleared (i.e., 32-bit
      segment) and RIP with any of the high bits set (i.e, RIP[63:32] != 0) set may
      not trigger #GP.  As we know, this imposes a security problem.
      
      In addition, the condition for two warnings was incorrect.
      
      Fixes: d1442d85Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      [Add #ifdef CONFIG_X86_64 to avoid complaints of undefined behavior. - Paolo]
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7e46dddd
  2. 31 Oct, 2014 2 commits
    • Andy Lutomirski's avatar
      x86_64, entry: Fix out of bounds read on sysenter · 653bc77a
      Andy Lutomirski authored
      Rusty noticed a Really Bad Bug (tm) in my NT fix.  The entry code
      reads out of bounds, causing the NT fix to be unreliable.  But, and
      this is much, much worse, if your stack is somehow just below the
      top of the direct map (or a hole), you read out of bounds and crash.
      
      Excerpt from the crash:
      
      [    1.129513] RSP: 0018:ffff88001da4bf88  EFLAGS: 00010296
      
        2b:*    f7 84 24 90 00 00 00     testl  $0x4000,0x90(%rsp)
      
      That read is deterministically above the top of the stack.  I
      thought I even single-stepped through this code when I wrote it to
      check the offset, but I clearly screwed it up.
      
      Fixes: 8c7aa698 ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")
      Reported-by: default avatarRusty Russell <rusty@ozlabs.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      653bc77a
    • Tony Lindgren's avatar
      net: smc91x: Fix gpios for device tree based booting · 7d2911c4
      Tony Lindgren authored
      With legacy booting, the platform init code was taking care of
      the configuring of GPIOs. With device tree based booting, things
      may or may not work depending what bootloader has configured or
      if the legacy platform code gets called.
      
      Let's add support for the pwrdn and reset GPIOs to the smc91x
      driver to fix the issues of smc91x not working properly when
      booted in device tree mode.
      
      And let's change n900 to use these settings as some versions
      of the bootloader do not configure things properly causing
      errors.
      Reported-by: default avatarKevin Hilman <khilman@linaro.org>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d2911c4
  3. 29 Oct, 2014 12 commits
  4. 28 Oct, 2014 11 commits
  5. 27 Oct, 2014 4 commits
  6. 25 Oct, 2014 2 commits
  7. 24 Oct, 2014 6 commits
    • Eric Paris's avatar
      i386/audit: stop scribbling on the stack frame · 26c2d2b3
      Eric Paris authored
      git commit b4f0d375 was very very dumb.
      It was writing over %esp/pt_regs semi-randomly on i686  with the expected
      "system can't boot" results.  As noted in:
      
      https://bugs.freedesktop.org/show_bug.cgi?id=85277
      
      This patch stops fscking with pt_regs.  Instead it sets up the registers
      for the call to __audit_syscall_entry in the most obvious conceivable
      way.  It then does just a tiny tiny touch of magic.  We need to get what
      started in PT_EDX into 0(%esp) and PT_ESI into 4(%esp).  This is as easy
      as a pair of pushes.
      
      After the call to __audit_syscall_entry all we need to do is get that
      now useless junk off the stack (pair of pops) and reload %eax with the
      original syscall so other stuff can keep going about it's business.
      Reported-by: default avatarPaulo Zanoni <przanoni@gmail.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Link: http://lkml.kernel.org/r/1414037043-30647-1-git-send-email-eparis@redhat.com
      Cc: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      26c2d2b3
    • Catalin Marinas's avatar
      arm64: Fix memblock current_limit with 64K pages and 48-bit VA · 3dec0fe4
      Catalin Marinas authored
      With 48-bit VA space, the 64K page configuration uses 3 levels instead
      of 2 and PUD_SIZE != PMD_SIZE. Since with 64K pages we only cover
      PMD_SIZE with the initial swapper_pg_dir populated in head.S, the
      memblock current_limit needs to be set accordingly in map_mem() to avoid
      allocating unmapped memory. The memblock current_limit is progressively
      increased as more blocks are mapped.
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      3dec0fe4
    • David S. Miller's avatar
      sparc64: Implement __get_user_pages_fast(). · 06090e8e
      David S. Miller authored
      It is not sufficient to only implement get_user_pages_fast(), you
      must also implement the atomic version __get_user_pages_fast()
      otherwise you end up using the weak symbol fallback implementation
      which simply returns zero.
      
      This is dangerous, because it causes the futex code to loop forever
      if transparent hugepages are supported (see get_futex_key()).
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      06090e8e
    • David S. Miller's avatar
      sparc64: Fix register corruption in top-most kernel stack frame during boot. · ef3e035c
      David S. Miller authored
      Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
      eventually narrowed this down to only impacting machines using
      UltraSPARC-III and derivitive cpus.
      
      The crash happens right when the first user process is spawned:
      
      [   54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      [   54.451346]
      [   54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab7 #96
      [   54.666431] Call Trace:
      [   54.698453]  [0000000000762f8c] panic+0xb0/0x224
      [   54.759071]  [000000000045cf68] do_exit+0x948/0x960
      [   54.823123]  [000000000042cbc0] fault_in_user_windows+0xe0/0x100
      [   54.902036]  [0000000000404ad0] __handle_user_windows+0x0/0x10
      [   54.978662] Press Stop-A (L1-A) to return to the boot prom
      [   55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
      
      Further investigation showed that compiling only per_cpu_patch() with
      an older compiler fixes the boot.
      
      Detailed analysis showed that the function is not being miscompiled by
      gcc-4.9, but it is using a different register allocation ordering.
      
      With the gcc-4.9 compiled function, something during the code patching
      causes some of the %i* input registers to get corrupted.  Perhaps
      we have a TLB miss path into the firmware that is deep enough to
      cause a register window spill and subsequent restore when we get
      back from the TLB miss trap.
      
      Let's plug this up by doing two things:
      
      1) Stop using the firmware stack for client interface calls into
         the firmware.  Just use the kernel's stack.
      
      2) As soon as we can, call into a new function "start_early_boot()"
         to put a one-register-window buffer between the firmware's
         deepest stack frame and the top-most initial kernel one.
      Reported-by: default avatarMeelis Roos <mroos@linux.ee>
      Tested-by: default avatarMeelis Roos <mroos@linux.ee>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef3e035c
    • Arun Chandran's avatar
      arm64: ASLR: Don't randomise text when randomise_va_space == 0 · 92980405
      Arun Chandran authored
      When user asks to turn off ASLR by writing "0" to
      /proc/sys/kernel/randomize_va_space there should not be
      any randomization to mmap base, stack, VDSO, libs, text and heap
      
      Currently arm64 violates this behavior by randomising text.
      Fix this by defining a constant ELF_ET_DYN_BASE. The randomisation of
      mm->mmap_base is done by setup_new_exec -> arch_pick_mmap_layout ->
      mmap_base -> mmap_rnd.
      Signed-off-by: default avatarArun Chandran <achandran@mvista.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      92980405
    • Ralf Baechle's avatar
      MIPS: SEAD3: Fix I2C device registration. · 4846f118
      Ralf Baechle authored
      This isn't a module and shouldn't be one.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      4846f118