    vfio/type1: DMA unmap chunking · 6fe1010d
    Alex Williamson authored
    
    
    When unmapping DMA entries we try to rely on the IOMMU API behavior
    that allows the IOMMU to unmap a larger area than requested, up to
    the size of the original mapping.  This works great when the IOMMU
    supports superpages *and* they're in use.  Otherwise, each PAGE_SIZE
    increment is unmapped separately, resulting in poor performance.
    
    Instead we can use the IOVA-to-physical-address translation provided
    by the IOMMU API and unmap using the largest contiguous physical
    memory chunk available, which is also how vfio/type1 would have
    mapped the region.  For a synthetic 1TB guest VM mapping and shutdown
    test on Intel VT-d (2M IOMMU pagesize support), this achieves about
    a 30% overall improvement mapping standard 4K pages, regardless of
    IOMMU superpage enabling, and about a 40% improvement mapping 2M
    hugetlbfs pages when IOMMU superpages are not available.  Hugetlbfs
    with IOMMU superpages enabled is effectively unchanged.
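
The chunking idea above can be illustrated with a small user-space sketch. It is not the kernel code from this commit: `toy_domain`, `toy_iova_to_phys`, `toy_contig_pages` and `toy_unmap_chunked` are hypothetical names, and the "IOMMU" is just an array giving the physical address behind each IOVA page. The sketch assumes a 4K page size and only counts how many unmap calls a chunked walk would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical stand-in for an IOMMU domain: phys[i] is the physical
 * address backing IOVA page i. */
struct toy_domain {
    const uint64_t *phys;
    size_t npages;
};

/* Stand-in for an IOVA-to-physical lookup. */
static uint64_t toy_iova_to_phys(const struct toy_domain *d, size_t page)
{
    assert(page < d->npages);
    return d->phys[page];
}

/* Count how many pages, starting at `start`, are backed by physically
 * contiguous memory. The result is always at least 1. */
static size_t toy_contig_pages(const struct toy_domain *d, size_t start)
{
    uint64_t base = toy_iova_to_phys(d, start);
    size_t len = 1;

    while (start + len < d->npages &&
           toy_iova_to_phys(d, start + len) ==
               base + (uint64_t)len * SKETCH_PAGE_SIZE)
        len++;
    return len;
}

/* Walk the whole range, issuing one (simulated) unmap per contiguous
 * chunk instead of one per page. Returns the number of unmap calls. */
static size_t toy_unmap_chunked(const struct toy_domain *d)
{
    size_t page = 0, calls = 0;

    while (page < d->npages) {
        page += toy_contig_pages(d, page);
        calls++;
    }
    return calls;
}
```

With six pages laid out as runs of three, two and one physically contiguous pages, the chunked walk issues three unmap calls where a page-by-page walk would issue six.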
    
    Unfortunately the same algorithm does not work well on IOMMUs with
    fine-grained superpages, like AMD-Vi, costing about 25% extra since
    the IOMMU will automatically unmap any power-of-two contiguous
    mapping we've provided it.  We add a routine and a domain flag to
    detect this feature, leaving AMD-Vi unaffected by this unmap
    optimization.
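
One plausible way to detect fine-grained superpages is to probe the domain: map two contiguous pages, ask to unmap only the first, and check whether the IOMMU reports more than one page unmapped. The message does not spell out the detection routine, so the sketch below is an assumption about the approach, run against a toy model. `toy_iommu`, `toy_map`, `toy_unmap` and `toy_probe_fgsp` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical toy IOMMU. When `merges_pow2` is set, a contiguous
 * power-of-two mapping is stored as one entry and is always unmapped
 * as a whole, mimicking fine-grained superpage behavior. */
struct toy_iommu {
    bool merges_pow2;
    size_t mapped;  /* bytes currently mapped starting at IOVA 0 */
};

static void toy_map(struct toy_iommu *m, size_t size)
{
    assert(size > 0);
    m->mapped = size;
}

/* Returns the number of bytes actually unmapped, which may exceed
 * the requested size when the toy IOMMU merged the mapping. */
static size_t toy_unmap(struct toy_iommu *m, size_t size)
{
    size_t done;

    if (m->mapped == 0)
        return 0;
    if (m->merges_pow2)
        done = m->mapped;
    else
        done = size < m->mapped ? size : m->mapped;
    m->mapped -= done;
    return done;
}

/* Probe: map two pages, unmap one, and see how much went away.
 * Returns true if the IOMMU unmapped more than was requested. */
static bool toy_probe_fgsp(struct toy_iommu *m)
{
    size_t unmapped;

    toy_map(m, 2 * SKETCH_PAGE_SIZE);
    unmapped = toy_unmap(m, SKETCH_PAGE_SIZE);
    if (unmapped == SKETCH_PAGE_SIZE) {
        /* Only one page was removed; clean up the second one. */
        toy_unmap(m, SKETCH_PAGE_SIZE);
        return false;
    }
    return true;
}
```

A caller would run the probe once per domain and cache the result in a flag, skipping the chunked unmap path whenever the flag is set.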
    
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>