    KVM: nVMX: VMX instructions: fix segment checks when L1 is in long mode. · ff30ef40
    Quentin Casasnovas authored
    I couldn't get Xen to boot an L2 HVM guest when it was nested under KVM -
    it was getting a GP(0) on an otherwise unremarkable vmread from Xen:
    
         (XEN) ----[ Xen-4.7.0-rc  x86_64  debug=n  Not tainted ]----
         (XEN) CPU:    1
         (XEN) RIP:    e008:[<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
         (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d1v0)
         (XEN) rax: ffff82d0801e6288   rbx: ffff83003ffbfb7c   rcx: fffffffffffab928
         (XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: ffff83000bdd0000
         (XEN) rbp: ffff83000bdd0000   rsp: ffff83003ffbfab0   r8:  ffff830038813910
         (XEN) r9:  ffff83003faf3958   r10: 0000000a3b9f7640   r11: ffff83003f82d418
         (XEN) r12: 0000000000000000   r13: ffff83003ffbffff   r14: 0000000000004802
         (XEN) r15: 0000000000000008   cr0: 0000000080050033   cr4: 00000000001526e0
         (XEN) cr3: 000000003fc79000   cr2: 0000000000000000
         (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
         (XEN) Xen code around <ffff82d0801e629e> (vmx_get_segment_register+0x14e/0x450):
         (XEN)  00 00 41 be 02 48 00 00 <44> 0f 78 74 24 08 0f 86 38 56 00 00 b8 08 68 00
         (XEN) Xen stack trace from rsp=ffff83003ffbfab0:
    
         ...
    
         (XEN) Xen call trace:
         (XEN)    [<ffff82d0801e629e>] vmx_get_segment_register+0x14e/0x450
         (XEN)    [<ffff82d0801f3695>] get_page_from_gfn_p2m+0x165/0x300
         (XEN)    [<ffff82d0801bfe32>] hvmemul_get_seg_reg+0x52/0x60
         (XEN)    [<ffff82d0801bfe93>] hvm_emulate_prepare+0x53/0x70
         (XEN)    [<ffff82d0801ccacb>] handle_mmio+0x2b/0xd0
         (XEN)    [<ffff82d0801be591>] emulate.c#_hvm_emulate_one+0x111/0x2c0
         (XEN)    [<ffff82d0801cd6a4>] handle_hvm_io_completion+0x274/0x2a0
         (XEN)    [<ffff82d0801f334a>] __get_gfn_type_access+0xfa/0x270
         (XEN)    [<ffff82d08012f3bb>] timer.c#add_entry+0x4b/0xb0
         (XEN)    [<ffff82d08012f80c>] timer.c#remove_entry+0x7c/0x90
         (XEN)    [<ffff82d0801c8433>] hvm_do_resume+0x23/0x140
         (XEN)    [<ffff82d0801e4fe7>] vmx_do_resume+0xa7/0x140
         (XEN)    [<ffff82d080164aeb>] context_switch+0x13b/0xe40
         (XEN)    [<ffff82d080128e6e>] schedule.c#schedule+0x22e/0x570
         (XEN)    [<ffff82d08012c0cc>] softirq.c#__do_softirq+0x5c/0x90
         (XEN)    [<ffff82d0801602c5>] domain.c#idle_loop+0x25/0x50
         (XEN)
         (XEN)
         (XEN) ****************************************
         (XEN) Panic on CPU 1:
         (XEN) GENERAL PROTECTION FAULT
         (XEN) [error_code=0000]
         (XEN) ****************************************
    
    Tracing my host KVM showed it was the one injecting the GP(0) when
    emulating the VMREAD and checking the destination segment permissions in
    get_vmx_mem_address():
    
         3)               |    vmx_handle_exit() {
         3)               |      handle_vmread() {
         3)               |        nested_vmx_check_permission() {
         3)               |          vmx_get_segment() {
         3)   0.074 us    |            vmx_read_guest_seg_base();
         3)   0.065 us    |            vmx_read_guest_seg_selector();
         3)   0.066 us    |            vmx_read_guest_seg_ar();
         3)   1.636 us    |          }
         3)   0.058 us    |          vmx_get_rflags();
         3)   0.062 us    |          vmx_read_guest_seg_ar();
         3)   3.469 us    |        }
         3)               |        vmx_get_cs_db_l_bits() {
         3)   0.058 us    |          vmx_read_guest_seg_ar();
         3)   0.662 us    |        }
         3)               |        get_vmx_mem_address() {
         3)   0.068 us    |          vmx_cache_reg();
         3)               |          vmx_get_segment() {
         3)   0.074 us    |            vmx_read_guest_seg_base();
         3)   0.068 us    |            vmx_read_guest_seg_selector();
         3)   0.071 us    |            vmx_read_guest_seg_ar();
         3)   1.756 us    |          }
         3)               |          kvm_queue_exception_e() {
         3)   0.066 us    |            kvm_multiple_exception();
         3)   0.684 us    |          }
         3)   4.085 us    |        }
         3)   9.833 us    |      }
         3) + 10.366 us   |    }
    
    Cross-checking the KVM/VMX VMREAD emulation code with the Intel Software
    Developer's Manual Volume 3C - "VMREAD - Read Field from Virtual-Machine
    Control Structure", I found that we're enforcing that the destination
    operand is NOT located in a read-only data segment or any code segment when
    the L1 is in long mode - BUT that check should only happen when it is in
    protected mode.
    
    Shuffling the code a bit to make our emulation follow the specification
    allows me to boot a Xen dom0 in a nested KVM and start HVM L2 guests
    without problems.
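
The distinction described above can be sketched as a small standalone C function. This is a minimal illustration, not the kernel patch itself: the names `cpu_mode`, `fault`, `seg`, `is_canonical48` and `check_vmx_mem_operand` are hypothetical, a 48-bit virtual address width is assumed, and the limit comparison is simplified to a fixed 8-byte operand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not the KVM data structures. */
enum cpu_mode { MODE_REAL, MODE_PROTECTED, MODE_LONG };
enum fault { FAULT_NONE, FAULT_GP, FAULT_SS };

struct seg {
    uint8_t  type;      /* 4-bit descriptor type: bit 3 = code, bit 1 = W/R */
    bool     unusable;  /* segment marked unusable */
    uint32_t limit;     /* segment limit in bytes */
    bool     is_ss;     /* true when the operand goes through SS */
};

/* Assumes 48-bit virtual addresses: bits 63..47 must all be equal. */
static bool is_canonical48(uint64_t addr)
{
    uint64_t upper = addr >> 47;
    return upper == 0 || upper == 0x1ffff;
}

/*
 * Decide which fault, if any, a VMX instruction's memory operand raises.
 * off is the offset within the segment, addr the resulting linear address.
 */
static enum fault check_vmx_mem_operand(enum cpu_mode mode,
                                        const struct seg *s,
                                        uint64_t off, uint64_t addr,
                                        bool is_write)
{
    /* Faults through SS are reported as #SS, everything else as #GP. */
    enum fault seg_fault = s->is_ss ? FAULT_SS : FAULT_GP;

    if (mode == MODE_LONG) {
        /* Long mode: the canonical-address check is the only check. */
        return is_canonical48(addr) ? FAULT_NONE : seg_fault;
    }

    if (mode == MODE_PROTECTED) {
        bool bad_type;

        if (is_write) {
            /* Read-only data segment or any code segment. */
            bad_type = (s->type & 0xa) == 0 || (s->type & 0x8) != 0;
        } else {
            /* Execute-only code segment. */
            bad_type = (s->type & 0xa) == 0x8;
        }
        if (bad_type)
            return FAULT_GP;

        if (s->unusable)
            return seg_fault;

        /* Simplified limit check for an 8-byte operand. */
        if (off + sizeof(uint64_t) > s->limit)
            return seg_fault;
    }

    return FAULT_NONE;
}
```

With this structure, a write through a segment whose type field looks like a code segment is rejected only in protected mode. In long mode the same operand passes as long as its address is canonical, which is the case the Xen vmread above was hitting.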
    
    Fixes: f9eb4af6 ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
    Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
    Cc: Eugene Korenevsky <ekorenevsky@gmail.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: Radim Krčmář <rkrcmar@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: linux-stable <stable@vger.kernel.org>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Name
..
Kconfig
Makefile
assigned-dev.c
assigned-dev.h
cpuid.c
cpuid.h
emulate.c
hyperv.c
hyperv.h
i8254.c
i8254.h
i8259.c
ioapic.c
ioapic.h
iommu.c
irq.c
irq.h
irq_comm.c
kvm_cache_regs.h
lapic.c
lapic.h
mmu.c
mmu.h
mmu_audit.c
mmutrace.h
mtrr.c
page_track.c
paging_tmpl.h
pmu.c
pmu.h
pmu_amd.c
pmu_intel.c
svm.c
trace.h
tss.h
vmx.c
x86.c
x86.h