1. 02 Aug, 2016 1 commit
    • vhost: new device IOTLB API · 6b1e6cc7
      Jason Wang authored
      This patch tries to implement a device IOTLB for vhost. This could be
      used with a userspace (qemu) implementation of DMA remapping
      to emulate an IOMMU for the guest.
      
      The idea is simple: cache the translations in a software device IOTLB
      (which is implemented as an interval tree) in vhost and use the vhost_net
      file descriptor for reporting IOTLB misses and IOTLB
      updates/invalidations. When vhost hits an IOTLB miss, the fault
      address, size and access type can be read from the file. After userspace
      finishes the translation, it writes the translated address to the
      vhost_net file to update the device IOTLB.
      
      When the device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM, all vq
      addresses set by ioctl are treated as IOVAs instead of virtual addresses, and
      they can only be accessed through the IOTLB instead of by direct userspace
      memory access. Before each round of vq processing, all vq metadata is
      prefetched into the device IOTLB to make sure no translation fault happens
      during vq processing.
      
      In most cases, virtqueues are contiguous even in virtual address space.
      The IOTLB translation for the virtqueue itself may make it a little
      slower. We might add a fast-path cache on top of this patch.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ]
      [mst: fix build warnings ]
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      [ weiyj.lk: missing unlock on error ]
      Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
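The miss/update cycle described in this commit message can be sketched in userspace C. This is only an illustration under stated assumptions: every name (`sketch_tlb`, `sketch_tlb_translate`, `sketch_tlb_update`, the `SKETCH_ACCESS_*` flags) is hypothetical and not part of the vhost API, a flat array stands in for the interval tree, and a lookup must be covered by a single cached entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; this is not the vhost UAPI. */
#define SKETCH_TLB_MAX   16
#define SKETCH_ACCESS_RO 0x1
#define SKETCH_ACCESS_WO 0x2
#define SKETCH_ACCESS_RW 0x3

struct sketch_tlb_entry {
    uint64_t iova_start;  /* first I/O virtual address covered */
    uint64_t iova_last;   /* last address covered (inclusive) */
    uint64_t uaddr;       /* userspace address that iova_start maps to */
    uint8_t  perm;        /* SKETCH_ACCESS_* bits allowed */
};

struct sketch_tlb_miss {
    uint64_t iova;
    uint64_t size;
    uint8_t  access;
};

struct sketch_tlb {
    struct sketch_tlb_entry entry[SKETCH_TLB_MAX];
    size_t count;
    struct sketch_tlb_miss miss;  /* what userspace would read on a miss */
    int miss_pending;
};

/* Returns 0 and fills *uaddr on a hit; returns -1 and records a miss otherwise. */
int sketch_tlb_translate(struct sketch_tlb *tlb, uint64_t iova, uint64_t size,
                         uint8_t access, uint64_t *uaddr)
{
    assert(tlb != NULL && uaddr != NULL);
    if (size == 0 || iova + size - 1 < iova)
        return -1;  /* empty or wrapping range: rejected, no miss reported */

    uint64_t last = iova + size - 1;
    for (size_t i = 0; i < tlb->count; i++) {
        const struct sketch_tlb_entry *e = &tlb->entry[i];
        if (iova >= e->iova_start && last <= e->iova_last &&
            (e->perm & access) == access) {
            *uaddr = e->uaddr + (iova - e->iova_start);
            return 0;
        }
    }
    tlb->miss = (struct sketch_tlb_miss){ iova, size, access };
    tlb->miss_pending = 1;
    return -1;
}

/* Models userspace writing a finished translation back to the device IOTLB. */
int sketch_tlb_update(struct sketch_tlb *tlb, uint64_t iova, uint64_t size,
                      uint64_t uaddr, uint8_t perm)
{
    if (tlb->count == SKETCH_TLB_MAX || size == 0 || iova + size - 1 < iova)
        return -1;

    uint64_t last = iova + size - 1;
    tlb->entry[tlb->count++] =
        (struct sketch_tlb_entry){ iova, last, uaddr, perm };

    /* Clear the pending miss only if the new entry fully covers it. */
    if (tlb->miss_pending && tlb->miss.iova >= iova &&
        tlb->miss.iova + tlb->miss.size - 1 <= last)
        tlb->miss_pending = 0;
    return 0;
}
```

A caller would retry `sketch_tlb_translate()` after the update. The real code additionally has to handle invalidation and ranges that span several entries, which this sketch leaves out.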
  2. 01 Aug, 2016 1 commit
    • vhost: convert pre sorted vhost memory array to interval tree · a9709d68
      Jason Wang authored
      The current pre-sorted memory region array has some limitations for the
      future device IOTLB conversion:
      
      1) it needs extra work to add or remove a single region, which is
         expected to be slow because of sorting or memory re-allocation.
      2) it needs extra work to remove a large range which may intersect
         several regions of different sizes.
      3) it needs a trick to support a replacement policy like LRU.
      
      To overcome the above shortcomings, this patch converts it to an interval
      tree, which can easily address the above issues with almost no extra
      work.
      
      The patch could be used for:
      
      - Extending the current API so that userspace only sends diffs of
        the memory table.
      - Simplifying the device IOTLB implementation.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
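To illustrate why an interval tree suits overlap queries on memory regions, here is a minimal userspace sketch. It is an assumption-laden stand-in, not the kernel code: the names `sketch_region`, `sketch_region_insert`, `sketch_region_find_overlap` and `sketch_region_free` are hypothetical, and the tree is an unbalanced augmented binary search tree, whereas the kernel's generic interval tree is built on a balanced red-black tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node type: an inclusive address range plus tree links. */
struct sketch_region {
    uint64_t start, last;        /* inclusive range covered by this region */
    uint64_t subtree_last;       /* largest 'last' anywhere in this subtree */
    struct sketch_region *left, *right;
};

/* Insert one region, ordered by start; returns 0 on success, -1 on OOM. */
int sketch_region_insert(struct sketch_region **link, uint64_t start, uint64_t last)
{
    assert(start <= last);
    struct sketch_region *r = calloc(1, sizeof(*r));
    if (!r)
        return -1;
    r->start = start;
    r->last = last;
    r->subtree_last = last;

    while (*link) {
        struct sketch_region *cur = *link;
        /* Keep the augmented maximum up to date on the way down. */
        if (last > cur->subtree_last)
            cur->subtree_last = last;
        link = start < cur->start ? &cur->left : &cur->right;
    }
    *link = r;
    return 0;
}

/* Return any region overlapping [start, last], or NULL if there is none. */
const struct sketch_region *
sketch_region_find_overlap(const struct sketch_region *node,
                           uint64_t start, uint64_t last)
{
    while (node) {
        if (node->start <= last && start <= node->last)
            return node;
        /* The left subtree can only contain a match if its maximum
         * end point reaches the start of the query. */
        if (node->left && node->left->subtree_last >= start)
            node = node->left;
        else
            node = node->right;
    }
    return NULL;
}

void sketch_region_free(struct sketch_region *node)
{
    if (!node)
        return;
    sketch_region_free(node->left);
    sketch_region_free(node->right);
    free(node);
}
```

Adding a single region touches one root-to-leaf path instead of re-sorting or re-allocating an array, and the `subtree_last` field lets a query skip subtrees that cannot intersect the range.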
  3. 01 Jul, 2016 1 commit
    • tun: switch to use skb array for tx · 1576d986
      Jason Wang authored
      We used to queue tx packets in sk_receive_queue. This is less
      efficient since it requires spinlocks to synchronize the producer
      and the consumer.
      
      This patch tries to address this by:
      
      - switching from sk_receive_queue to a skb_array, and resizing it when
        tx_queue_len is changed.
      - introducing a new proto_ops method, peek_len, which is used for peeking
        at the skb length.
      - implementing a tun version of peek_len for vhost_net to use and converting
        vhost_net to use peek_len if possible.
      
      A pktgen test shows about a 15.3% improvement in guest receive pps for small
      buffers:
      
      Before: ~1300000 pps
      After : ~1500000 pps
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
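The idea behind a pointer ring with a `peek_len`-style helper can be sketched in userspace C11. This is a simplified model under assumptions, not the kernel's skb_array: it supports exactly one producer and one consumer, uses C11 atomics instead of kernel primitives, and every name (`sketch_ring`, `sketch_pkt`, `sketch_ring_produce`, `sketch_ring_consume`, `sketch_ring_peek_len`) is hypothetical. A NULL slot means "empty", so the two sides never share an index or a lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 4  /* hypothetical capacity for this sketch */

struct sketch_pkt {
    size_t len;  /* stands in for the skb length */
};

struct sketch_ring {
    _Atomic(struct sketch_pkt *) slot[SKETCH_RING_SIZE];
    size_t head;  /* next slot to fill; touched by the producer only */
    size_t tail;  /* next slot to drain; touched by the consumer only */
};

void sketch_ring_init(struct sketch_ring *r)
{
    for (size_t i = 0; i < SKETCH_RING_SIZE; i++)
        atomic_init(&r->slot[i], NULL);
    r->head = 0;
    r->tail = 0;
}

/* Producer side: returns 0 on success, -1 if the ring is full. */
int sketch_ring_produce(struct sketch_ring *r, struct sketch_pkt *p)
{
    assert(p != NULL);  /* NULL is reserved to mean "empty slot" */
    if (atomic_load_explicit(&r->slot[r->head], memory_order_acquire) != NULL)
        return -1;
    atomic_store_explicit(&r->slot[r->head], p, memory_order_release);
    r->head = (r->head + 1) % SKETCH_RING_SIZE;
    return 0;
}

/* Consumer side: returns the oldest packet, or NULL if the ring is empty. */
struct sketch_pkt *sketch_ring_consume(struct sketch_ring *r)
{
    struct sketch_pkt *p =
        atomic_load_explicit(&r->slot[r->tail], memory_order_acquire);
    if (p == NULL)
        return NULL;
    atomic_store_explicit(&r->slot[r->tail], NULL, memory_order_release);
    r->tail = (r->tail + 1) % SKETCH_RING_SIZE;
    return p;
}

/* Consumer side: length of the oldest packet without dequeuing it, or 0. */
size_t sketch_ring_peek_len(struct sketch_ring *r)
{
    struct sketch_pkt *p =
        atomic_load_explicit(&r->slot[r->tail], memory_order_acquire);
    return p ? p->len : 0;
}
```

In this model a consumer can call `sketch_ring_peek_len()` to size its receive buffer before calling `sketch_ring_consume()`, which is the role the commit message assigns to `peek_len` for vhost_net.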
  4. 07 Jun, 2016 1 commit
    • vhost_net: stop polling socket during rx processing · 8241a1e4
      Jason Wang authored
      We don't stop polling the socket during rx processing. This leads
      to unnecessary wakeups from the underlying net devices (e.g.
      sock_def_readable() from tun), and rx is slowed down as a
      result. This patch avoids this by stopping socket polling during rx
      processing. A small drawback is that this introduces some overhead in
      the light-load case because of the extra start/stop polling, but a single
      netperf TCP_RR test does not show any change. In a very heavy load case,
      e.g. using pktgen to inject packets into the guest, we get about an 8.8%
      improvement in pps:
      
      before: ~1240000 pkt/s
      after:  ~1350000 pkt/s
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
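The effect described above can be shown with a toy model. It is purely illustrative and makes strong assumptions: there is no real socket or scheduler, "wakeups" are just a counter, and all names (`sketch_sock`, `sketch_sock_enqueue`, `sketch_handle_rx`) are hypothetical. Packets that arrive while the rx handler is already running only cost a wakeup if polling is still enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket being polled by the rx handler. */
struct sketch_sock {
    bool polling;      /* true while the handler is registered for wakeups */
    unsigned queued;   /* packets waiting to be received */
    unsigned wakeups;  /* wakeups delivered to the handler */
};

/* A packet arrives: it is queued, and a wakeup fires only while polling. */
void sketch_sock_enqueue(struct sketch_sock *s)
{
    assert(s != NULL);
    s->queued++;
    if (s->polling)
        s->wakeups++;
}

/*
 * Drain the queue. 'arrivals' models packets that show up while the handler
 * is busy (one per processed packet until exhausted). With 'stop_polling'
 * set, polling is disabled for the duration of the loop and re-enabled once
 * the queue is empty. Returns the number of packets handled.
 */
unsigned sketch_handle_rx(struct sketch_sock *s, unsigned arrivals,
                          bool stop_polling)
{
    unsigned handled = 0;

    if (stop_polling)
        s->polling = false;

    while (s->queued > 0) {
        s->queued--;
        handled++;
        if (arrivals > 0) {
            arrivals--;
            sketch_sock_enqueue(s);
        }
    }

    s->polling = true;  /* resume polling only after the queue is drained */
    return handled;
}
```

In this model both variants process the same number of packets. Only the number of wakeups differs, and the extra stop/start step is the light-load overhead the commit message mentions.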
  5. 10 Mar, 2016 1 commit
  6. 02 Mar, 2016 1 commit
    • vhost: rename vhost_init_used() · 80f7d030
      Greg Kurz authored
      Looking at how callers use this, maybe we should just rename init_used
      to vhost_vq_init_access. The _used suffix was a hint that we
      access the vq used ring. But maybe what callers care about is
      that it must be called after access_ok.
      
      Also, this function manipulates the vq->is_le field which isn't related
      to the vq used ring.
      
      This patch simply renames vhost_init_used() to vhost_vq_init_access() as
      suggested by Michael.
      
      No behaviour change.
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  7. 16 Sep, 2015 1 commit
  8. 11 Apr, 2015 1 commit
  9. 02 Mar, 2015 1 commit
  10. 27 Feb, 2015 2 commits
  11. 15 Feb, 2015 1 commit
  12. 04 Feb, 2015 1 commit
  13. 03 Feb, 2015 2 commits
  14. 13 Jan, 2015 1 commit
  15. 07 Jan, 2015 1 commit
  16. 09 Dec, 2014 5 commits
  17. 23 Jun, 2014 1 commit
  18. 09 Jun, 2014 3 commits
    • vhost: move memory pointer to VQs · 47283bef
      Michael S. Tsirkin authored
      commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
          vhost: replace rcu with mutex
      replaced rcu sync for memory accesses with VQ mutex lock/unlock.
      This is correct since all accesses are under VQ mutex, but incomplete:
      we still do useless rcu lock/unlock operations, and someone might copy this
      code into some other context where this won't be right.
      This use of RCU is also non-standard and hard to understand.
      Let's copy the pointer to each VQ structure; this way
      the access rules become straightforward, and there's
      no need for RCU anymore.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost: move acked_features to VQs · ea16c514
      Michael S. Tsirkin authored
      Refactor code to make sure features are only accessed
      under VQ mutex. This makes everything simpler, no need
      for RCU here anymore.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • vhost-net: extend device allocation to vmalloc · 23cc5a99
      Michael S. Tsirkin authored
      Michael Mueller provided a patch to reduce the size of the
      vhost-net structure, as some allocations could fail under
      memory pressure/fragmentation. We are still left with
      high-order allocations though.
      
      This patch handles the problem at the core level, allowing
      vhost structures to use vmalloc() if kmalloc() fails.
      
      As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
      to kzalloc() flags to do this fallback only when really needed.
      
      People are still looking at cleaner ways to handle the problem
      at the API level, probably passing in multiple iovecs.
      This hack seems consistent with approaches
      taken since then by drivers/vhost/scsi.c and net/core/dev.c
      
      Based on patch by Romain Francoise.
      
      Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
      Signed-off-by: Romain Francoise <romain@orebokech.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
  19. 01 Apr, 2014 1 commit
  20. 28 Mar, 2014 2 commits
  21. 13 Feb, 2014 2 commits
  22. 06 Dec, 2013 1 commit
  23. 03 Sep, 2013 5 commits
  24. 11 Jul, 2013 2 commits
  25. 09 Jul, 2013 1 commit