Skip to content
  • Michael Tokarev's avatar
    fix crash in migration, 32-bit userspace on 64-bit host · 51b0c606
    Michael Tokarev authored
    
    
    This change fixes a long-standing immediate crash (memory corruption
    and abort in glibc malloc code) in migration on 32bits.
    
    The bug is present since this commit:
    
      commit 692d9aca97b865b0f7903565274a52606910f129
      Author: Bruce Rogers <brogers@novell.com>
      Date:   Wed Sep 23 16:13:18 2009 -0600
    
        qemu-kvm: allocate correct size for dirty bitmap
    
        The dirty bitmap copied out to userspace is stored in a long array,
        and gets copied out to userspace accordingly.  This patch accounts
        for that correctly.  Currently I'm seeing kvm crashing due to writing
        beyond the end of the alloc'd dirty bitmap memory, because the buffer
        has the wrong size.
    
        Signed-off-by: Bruce Rogers
    Signed-off-by: default avatarMarcelo Tosatti <mtosatti@redhat.com>
    
     --- a/qemu-kvm.c
     +++ b/qemu-kvm.c
     @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
     -            buf = qemu_malloc((slots[i].len / 4096 + 7) / 8 + 2);
     +            buf = qemu_malloc(BITMAP_SIZE(slots[i].len));
                 r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf);
    
    BITMAP_SIZE is now open-coded in that function, like this:
    
     size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), HOST_LONG_BITS) / 8;
    
    The problem is that HOST_LONG_BITS in 32bit userspace is 32
    but it's 64 in 64bit kernel.  So userspace aligns this to
    32, and kernel to 64, but since no length is passed from
    userspace to kernel on ioctl, kernel uses its size calculation
    and copies 4 extra bytes to userspace, corrupting memory.
    
    Here's how it looks like during migrate execution:
    
    our=20, kern=24
    our=4, kern=8
    ...
    our=4, kern=8
    our=4064, kern=4064
    our=512, kern=512
    our=4, kern=8
    our=20, kern=24
    our=4, kern=8
    ...
    our=4, kern=8
    our=4064, kern=4064
    *** glibc detected *** ./x86_64-softmmu/qemu-system-x86_64: realloc(): invalid next size: 0x08f20528 ***
    
    (our is userspace size above, kern is the size as calculated
    by the kernel).
    
    Fix this by always aligning to 64 in a hope that no platform will
    have sizeof(long)>8 any time soon, and add a comment describing it
    all.  It's a small price to pay for bad kernel design.
    
    Alternatively it's possible to fix that in the kernel by using
    different size calculation depending on the current process.
    But this becomes quite ugly.
    
    Special thanks goes to Stefan Hajnoczi for spotting the fundamental
    cause of the issue, and to Alexander Graf for his support in #qemu.
    
    Signed-off-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
    CC: Bruce Rogers <brogers@novell.com>
    Signed-off-by: default avatarAvi Kivity <avi@redhat.com>
    51b0c606