1. 08 Aug, 2016 1 commit
    • vfio/pci: Fix NULL pointer oops in error interrupt setup handling · c8952a70
      Alex Williamson authored
      There are multiple cases in vfio_pci_set_ctx_trigger_single() where
      we assume we can safely read from our data pointer without actually
      checking whether the user has passed any data via the count field.
      VFIO_IRQ_SET_DATA_NONE in particular is entirely broken since we
      attempt to pull an int32_t file descriptor out before even checking
      the data type.  The other data types assume the data pointer contains
      one element of their type as well.
      
      In part this is good news because we were previously restricted from
      doing much sanitization of parameters because it was missed in the
      past and we didn't want to break existing users.  Clearly DATA_NONE
      is completely broken, so it must not have any users and we can fix
      it up completely.  For DATA_BOOL and DATA_EVENTFD, we'll just
      protect ourselves, returning error when count is zero since we
      previously would have oopsed.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Reported-by: Chris Thompson <the_cartographer@hotmail.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
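The validation order described in this message can be sketched in user-space C. This is a minimal illustration, not the kernel patch: the `sketch_*` names and flag values are hypothetical stand-ins for the `VFIO_IRQ_SET_DATA_*` flags, and treating a descriptor of -1 as "clear" is an assumption made for the example. The point is only that the data type and `count` are checked before `data` is ever read.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the VFIO_IRQ_SET_DATA_* flags; the values are illustrative. */
enum sketch_data_type {
    SKETCH_DATA_NONE    = 1 << 0, /* no payload follows */
    SKETCH_DATA_BOOL    = 1 << 1, /* payload is one uint8_t */
    SKETCH_DATA_EVENTFD = 1 << 2  /* payload is one int32_t descriptor */
};

/* What a handler would do once the request has been validated. */
enum sketch_action { SKETCH_NOP, SKETCH_TRIGGER, SKETCH_SET_FD, SKETCH_CLEAR_FD };

/*
 * Hypothetical helper: look at the data type first, then at count, and
 * only then touch `data`.  Returns 0 and fills *action, or -EINVAL.
 */
int sketch_set_trigger(unsigned int type, unsigned int count,
                       const void *data, enum sketch_action *action,
                       int32_t *fd_out)
{
    assert(action != NULL && fd_out != NULL);
    *action = SKETCH_NOP;

    /* DATA_NONE carries no payload, so `data` is never dereferenced. */
    if (type == SKETCH_DATA_NONE) {
        *action = SKETCH_TRIGGER;
        return 0;
    }

    /* Every other type needs at least one element behind `data`. */
    if (count == 0 || data == NULL)
        return -EINVAL;

    if (type == SKETCH_DATA_BOOL) {
        uint8_t trigger;
        memcpy(&trigger, data, sizeof(trigger));
        if (trigger)
            *action = SKETCH_TRIGGER;
        return 0;
    }

    if (type == SKETCH_DATA_EVENTFD) {
        int32_t fd;
        memcpy(&fd, data, sizeof(fd));
        /* Assumption for this sketch: -1 means "drop the current eventfd". */
        *action = (fd == -1) ? SKETCH_CLEAR_FD : SKETCH_SET_FD;
        *fd_out = fd;
        return 0;
    }

    return -EINVAL; /* unknown data type */
}
```

Calling `sketch_set_trigger(SKETCH_DATA_EVENTFD, 0, ...)` returns `-EINVAL` instead of reading a descriptor that was never supplied, which is the failure mode the commit closes.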
  2. 19 Jul, 2016 9 commits
  3. 14 Jul, 2016 1 commit
  4. 08 Jul, 2016 1 commit
    • vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive · 05f0c03f
      Yongji Xie authored
      The current vfio-pci implementation does not allow mmap of
      sub-page (size < PAGE_SIZE) MMIO BARs because the MMIO page
      backing such a BAR may be shared with other BARs. This causes
      performance problems when we pass through a PCI device with
      BARs of this kind: the guest cannot access the BARs directly,
      so every MMIO access is emulated in the host.
      
      However, not all sub-page BARs share a page with other BARs.
      We should allow mmap of those sub-page MMIO BARs that we can
      guarantee will not share a page with other BARs.
      
      This patch adds support for this case. We also try to add a
      dummy resource to reserve the remainder of the page, which a
      hot-added device's BAR might otherwise be assigned into. It is
      not necessary to handle the case where the BAR is not page
      aligned, because we cannot expect the BAR to be assigned to the
      same location within a page in the guest when we pass it
      through, and it is hard to access such a BAR from userspace
      because we have no way to get the BAR's location within a page.
      Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  5. 23 Jun, 2016 1 commit
    • vfio: platform: support No-IOMMU mode · 9698cbf0
      Peng Fan authored
      No-IOMMU mode was added to vfio by
      commit 03a76b60 ("vfio: Include No-IOMMU mode"),
      but it only supports vfio-pci.
      
      By using vfio_iommu_group_get/put instead of iommu_group_get/put,
      platform devices can be exposed to userspace when
      CONFIG_VFIO_NOIOMMU and the "enable_unsafe_noiommu_mode"
      option are enabled.
      
      From commit 03a76b60 ("vfio: Include No-IOMMU mode"):
      "This should make it very clear that this mode is not safe.
      Additionally, CAP_SYS_RAWIO privileges are necessary to work
      with groups and containers using this mode.  Groups making
      use of this support are named /dev/vfio/noiommu-$GROUP and
      can only make use of the special VFIO_NOIOMMU_IOMMU for the
      container.  Use of this mode, specifically binding a device
      without a native IOMMU group to a VFIO bus driver will taint
      the kernel and should therefore not be considered supported."
      Signed-off-by: Peng Fan <van.freenix@gmail.com>
      Cc: Eric Auger <eric.auger@linaro.org>
      Cc: Baptiste Reynal <b.reynal@virtualopensystems.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  6. 31 May, 2016 1 commit
  7. 30 May, 2016 2 commits
  8. 19 May, 2016 1 commit
    • vfio_pci: Test for extended capabilities if config space > 256 bytes · f7055280
      Alexey Kardashevskiy authored
      The PCI Express spec says that reading 4 bytes at offset 100h should
      return zero if there is no extended capability, so VFIO reads this
      dword to find out whether there are extended capabilities.
      
      However, it is not always possible to access the extended space, so
      generic PCI code in pci_cfg_space_size_ext() checks if
      pci_read_config_dword() can read beyond 100h and if the check fails,
      it sets the config space size to 100h.
      
      VFIO does its own extended capabilities check by reading at offset
      100h, which may produce 0xffffffff.  VFIO treats this as the presence
      of extended config space and calls vfio_ecap_init(), which fails to
      parse capabilities (as expected), but right before exiting it writes
      zero at offset 100h.  That offset is beyond the buffer allocated for
      vdev->vconfig (which is 256 bytes), which leads to random memory
      corruption.
      
      This makes VFIO only check for the extended capabilities if
      the discovered config size is more than 256 bytes.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
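The guard this commit describes can be sketched against a plain byte buffer standing in for the config space shadow. The names below (`sketch_read_dword`, `sketch_has_ext_caps`, the `SKETCH_*` constants) are hypothetical and the code is a user-space illustration, not the vfio-pci implementation. It shows the ordering only: compare the discovered config size against 256 bytes first, and touch offset 100h only when that offset is inside the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CFG_SIZE     256u   /* conventional config space size */
#define SKETCH_EXT_CFG_BASE 0x100u /* offset of the first extended capability header */

/* Bounds-checked little-endian dword read from the config shadow. */
static bool sketch_read_dword(const uint8_t *cfg, size_t cfg_size,
                              size_t off, uint32_t *val)
{
    if (off + 4 > cfg_size)
        return false;
    *val = (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
    return true;
}

/*
 * Hypothetical probe: extended capabilities are only considered when the
 * discovered config size is larger than 256 bytes.
 */
bool sketch_has_ext_caps(const uint8_t *cfg, size_t cfg_size)
{
    uint32_t hdr;

    assert(cfg != NULL);

    if (cfg_size <= SKETCH_CFG_SIZE)
        return false; /* offset 100h lies outside the buffer */
    if (!sketch_read_dword(cfg, cfg_size, SKETCH_EXT_CFG_BASE, &hdr))
        return false;
    return hdr != 0;  /* zero means no extended capability */
}
```

With a 256-byte buffer the probe returns early and never reads or writes at offset 100h, which is the out-of-bounds access the message describes.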
  9. 11 May, 2016 1 commit
  10. 09 May, 2016 1 commit
    • iommu: Allow selecting page sizes per domain · d16e0faa
      Robin Murphy authored
      Many IOMMUs support multiple page table formats, meaning that any given
      domain may only support a subset of the hardware page sizes presented in
      iommu_ops->pgsize_bitmap. There are also certain use-cases where the
      creator of a domain may want to control which page sizes are used, for
      example to force the use of hugepage mappings to reduce pagetable walk
      depth.
      
      To this end, add a per-domain pgsize_bitmap to represent the subset of
      page sizes actually in use, to make it possible for domains with
      different requirements to coexist.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      [rm: hijacked and rebased original patch with new commit message]
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  11. 28 Apr, 2016 3 commits
    • vfio_iommu_spapr_tce: Remove unneeded iommu_group_get_iommudata · 5ed4aba1
      Alexey Kardashevskiy authored
      This removes iommu_group_get_iommudata() as the result is never used.
      As this is a minor cleanup, no change in behavior is expected.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/pci: Add test for BAR restore · dc928109
      Alex Williamson authored
      If a device is reset without the memory or i/o bits enabled in the
      command register we may not detect it, potentially leaving the device
      without valid BAR programming.  Add an additional test to check the
      BARs on each write to the command register.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio/pci: Hide broken INTx support from user · 45074405
      Alex Williamson authored
      INTx masking has two components.  The first is that we need the ability
      to prevent the device from continuing to assert INTx.  This is
      provided via the DisINTx bit in the command register and is the only
      thing we can really probe for when testing if INTx masking is
      supported.  The second component is that the device needs to indicate
      if INTx is asserted via the interrupt status bit in the device status
      register.  With these two features we can generically determine if one
      of the devices we own is asserting INTx, signal the user, and mask the
      interrupt while the user services the device.
      
      Generally if one or both of these components is broken we resort to
      APIC level interrupt masking, which requires an exclusive interrupt
      since we have no way to determine the source of the interrupt in a
      shared configuration.  This often makes it difficult or impossible to
      configure the system for userspace use of the device, for an interrupt
      mode that the user may not need.
      
      One possible configuration of broken INTx masking is that the DisINTx
      support is fully functional, but the interrupt status bit never
      signals interrupt assertion.  In this case we do have the ability to
      prevent the device from asserting INTx, but lack the ability to
      identify the interrupt source.  For this case we can simply pretend
      that the device lacks INTx support entirely, keeping DisINTx set on
      the physical device, virtualizing this bit for the user, and
      virtualizing the interrupt pin register to indicate no INTx support.
      We already support virtualization of the DisINTx bit and already
      virtualize the interrupt pin for platforms without INTx support.  By
      tying these components together, setting DisINTx on open and reset,
      and identifying devices broken in this particular way, we can provide
      support for them w/o the handicap of APIC level INTx masking.
      
      Intel i40e (XL710/X710) 10/20/40GbE NICs have been identified as being
      broken in this specific way.  We leave the vfio-pci.nointxmask option
      as a mechanism to bypass this support, enabling INTx on the device
      with all the requirements of APIC level masking.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: John Ronciak <john.ronciak@intel.com>
      Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
  12. 28 Feb, 2016 1 commit
  13. 25 Feb, 2016 1 commit
  14. 22 Feb, 2016 9 commits
  15. 27 Jan, 2016 1 commit
  16. 04 Jan, 2016 1 commit
  17. 21 Dec, 2015 2 commits
    • vfio: Include No-IOMMU mode · 03a76b60
      Alex Williamson authored
      There is really no way to safely give a user full access to a DMA
      capable device without an IOMMU to protect the host system.  There is
      also no way to provide DMA translation, for use cases such as device
      assignment to virtual machines.  However, there are still those users
      that want userspace drivers even under those conditions.  The UIO
      driver exists for this use case, but does not provide the degree of
      device access and programming that VFIO has.  In an effort to avoid
      code duplication, this introduces a No-IOMMU mode for VFIO.
      
      This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling
      the "enable_unsafe_noiommu_mode" option on the vfio driver.  This
      should make it very clear that this mode is not safe.  Additionally,
      CAP_SYS_RAWIO privileges are necessary to work with groups and
      containers using this mode.  Groups making use of this support are
      named /dev/vfio/noiommu-$GROUP and can only make use of the special
      VFIO_NOIOMMU_IOMMU for the container.  Use of this mode, specifically
      binding a device without a native IOMMU group to a VFIO bus driver
      will taint the kernel and should therefore not be considered
      supported.  This patch includes no-iommu support for the vfio-pci bus
      driver only.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
    • VFIO: platform: reset: fix a warning message condition · 96762882
      Dan Carpenter authored
      This loop ends with count set to -1 and not zero, so the warning message
      isn't printed when it should be.  I've fixed this by changing the
      post-op to a pre-op.
      
      Fixes: 0990822c ('VFIO: platform: reset: AMD xgbe reset module')
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Eric Auger <eric.auger@linaro.org>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
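The difference between the two loop forms is easy to reproduce outside the kernel. The sketch below is illustrative only: `sketch_busy()` simulates a hardware status poll, and both function names are hypothetical. Each function returns the final value of `count`, which is what a later `if (!count)` timeout check would see.

```c
#include <assert.h>

/* Simulated status poll: reports "still busy" until the ready_after-th call. */
static int sketch_busy(int *polls, int ready_after)
{
    return ++(*polls) < ready_after;
}

/* Post-decrement form: on timeout the counter ends at -1, not 0. */
int sketch_wait_postfix(int count, int ready_after)
{
    int polls = 0;

    assert(count > 0);
    while (count-- && sketch_busy(&polls, ready_after))
        ;
    return count;
}

/* Pre-decrement form: on timeout the counter ends at exactly 0. */
int sketch_wait_prefix(int count, int ready_after)
{
    int polls = 0;

    assert(count > 0);
    while (--count && sketch_busy(&polls, ready_after))
        ;
    return count;
}
```

With a budget of 3 polls and hardware that never becomes ready, the postfix version returns -1 and a `!count` test stays false, while the prefix version returns 0 and the warning fires. When the hardware is ready on the first poll, both return 2.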
  18. 04 Dec, 2015 1 commit
  19. 21 Nov, 2015 1 commit
  20. 20 Nov, 2015 1 commit