    x86-32: Separate 1:1 pagetables from swapper_pg_dir · fd89a137
    Joerg Roedel authored
    
    
    This patch fixes machine crashes which occur when heavily exercising the
    CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
    AMD Erratum 383 and result in a fatal machine check exception. Here's
    the scenario:
    
    1. On 32-bit, the swapper_pg_dir page table is used as the initial page
    table for booting a secondary CPU.
    
    2. To make this work, swapper_pg_dir needs a direct mapping of physical
    memory in it (the low mappings). By adding those low, large page (2M)
    mappings (PAE kernel), we create the necessary conditions for Erratum
    383 to occur.
    
    3. Other CPUs which do not participate in the off- and onlining game may
    use swapper_pg_dir while the low mappings are present (when leave_mm is
    called). For all steps below, the CPU referred to is a CPU that is using
    swapper_pg_dir, and not the CPU which is being onlined.
    
    4. The presence of the low mappings in swapper_pg_dir can cause TLB
    entries for addresses below __PAGE_OFFSET to be established
    speculatively. These TLB entries are marked global and large.
    
    5. When the CPU with such a TLB entry switches to another page table,
    this TLB entry remains because it is global.
    
    6. A process running on that CPU then generates an access to an address covered by the
    above TLB entry but there is a permission mismatch - the TLB entry
    covers a large global page not accessible to userspace.
    
    7. Due to this permission mismatch, a new 4K user TLB entry gets
    established. Further, Erratum 383 provides for a small window of time
    where both TLB entries are present. This results in an uncorrectable
    machine check exception signalling a TLB multimatch which panics the
    machine.
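
The scenario above can be sketched with a small userspace toy model. This is only an illustration under simplifying assumptions, not kernel code and not a description of the real hardware: the `toy_*` names, the 8-slot TLB and the addresses are all hypothetical. It shows only the bookkeeping: a plain cr3 reload keeps global entries, so a later 4K user entry can overlap a stale 2M global one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLB_SLOTS 8          /* arbitrary size for the toy model */
#define TOY_SZ_4K 0x1000u
#define TOY_SZ_2M 0x200000u

struct toy_tlb_entry {
    uint32_t base;   /* start of the virtual range covered */
    uint32_t size;   /* TOY_SZ_4K or TOY_SZ_2M */
    bool global;     /* survives a plain cr3 reload */
    bool user;       /* accessible from user mode */
    bool valid;
};

struct toy_tlb {
    struct toy_tlb_entry slot[TOY_TLB_SLOTS];
};

/* Install an entry in the first free slot; returns false if the TLB is full. */
bool toy_tlb_insert(struct toy_tlb *t, uint32_t base, uint32_t size,
                    bool global, bool user)
{
    assert(size == TOY_SZ_4K || size == TOY_SZ_2M);
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct toy_tlb_entry){ base, size, global, user, true };
            return true;
        }
    }
    return false;
}

/* Plain cr3 reload: only non-global entries are dropped (step 5). */
void toy_tlb_switch_cr3(struct toy_tlb *t)
{
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
        if (t->slot[i].valid && !t->slot[i].global)
            t->slot[i].valid = false;
}

/* Global flush: every entry is dropped, global or not. */
void toy_tlb_flush_all(struct toy_tlb *t)
{
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
        t->slot[i].valid = false;
}

/* Number of valid entries covering addr; more than one is a multimatch. */
int toy_tlb_matches(const struct toy_tlb *t, uint32_t addr)
{
    int n = 0;
    for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
        const struct toy_tlb_entry *e = &t->slot[i];
        if (e->valid && addr >= e->base && addr - e->base < e->size)
            n++;
    }
    return n;
}
```

In this model, inserting a global 2M kernel entry, calling `toy_tlb_switch_cr3()`, and then inserting a 4K user entry inside the same range makes `toy_tlb_matches()` report two entries for that address, which corresponds to the multimatch condition in step 7.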
    
    There are two ways to fix this issue:
    
            1. Always do a global TLB flush when a new cr3 is loaded and the
            old page table was swapper_pg_dir. I consider this a hack that
            is hard to understand and has performance implications.
    
            2. Follow the 64-bit approach and do not use swapper_pg_dir to
            boot secondary CPUs.
    
    This patch implements solution 2. It introduces a trampoline_pg_dir
    which has the same layout as swapper_pg_dir with low_mappings. This page
    table is used as the initial page table of the booting CPU. Later in the
    bringup process, it switches to swapper_pg_dir and does a global TLB
    flush. This fixes the crashes in our test cases.
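
The ordering described for solution 2 can be sketched as a userspace toy model. This is a rough illustration only, not the code in this patch: the `toy_*` types and functions are hypothetical, and speculative TLB fills are modeled pessimistically as always happening whenever a CPU runs on a page table that contains the low mappings.

```c
#include <assert.h>
#include <stdbool.h>

/* A page directory, reduced to the one property that matters here. */
struct toy_pgdir {
    bool has_low_mappings;
};

struct toy_cpu {
    const struct toy_pgdir *cr3;   /* currently loaded page directory */
    bool has_global_low_tlb;       /* stale global entry below __PAGE_OFFSET */
};

/* Plain cr3 load: existing global TLB entries survive it. */
void toy_load_cr3(struct toy_cpu *cpu, const struct toy_pgdir *pgd)
{
    cpu->cr3 = pgd;
    /* Assumption: a speculative fill always happens if low mappings exist. */
    if (pgd->has_low_mappings)
        cpu->has_global_low_tlb = true;
}

/* Global TLB flush: drops global entries as well. */
void toy_flush_tlb_all(struct toy_cpu *cpu)
{
    cpu->has_global_low_tlb = false;
}

/* Bring-up order from the text: trampoline first, then swapper plus flush. */
void toy_boot_secondary(struct toy_cpu *cpu,
                        const struct toy_pgdir *trampoline,
                        const struct toy_pgdir *swapper)
{
    assert(trampoline->has_low_mappings && !swapper->has_low_mappings);
    toy_load_cr3(cpu, trampoline);  /* initial page table of the booting CPU */
    toy_load_cr3(cpu, swapper);     /* switch early in start_secondary() */
    toy_flush_tlb_all(cpu);         /* discard global low-mapping entries */
}
```

In this model a bystander CPU that only ever loads the swapper page directory never acquires a global low-mapping entry, and the booting CPU discards its own before it runs anything else, which is the property the patch relies on.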
    
    -v2: switch to swapper_pg_dir right after entering start_secondary() so
    that we are able to access percpu data which might not be mapped in the
    trampoline page table.
    
    Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
    LKML-Reference: <20100816123833.GB28147@aftab>
    Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>