1. 23 Jul, 2014 2 commits
    • virtio-net: rx busy polling support · 91815639
      Jason Wang authored
      
      Add basic support for rx busy polling. Instead of introducing new
      states and a spinlock to synchronize between NAPI and the polling method,
      this patch simply reuses the NAPI state, which avoids extra overhead in
      the fast path and simplifies the code.
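      
      The idea of reusing the NAPI state can be sketched with a toy model. This
      is not the driver code: struct toy_napi, toy_schedule_prep(),
      toy_complete(), toy_busy_poll() and toy_receive_two() are hypothetical
      names, and a single atomic flag stands in for the NAPI "scheduled" bit.
      Whoever sets the flag first owns the receive queue, so no extra lock is
      needed.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the NAPI state: one "scheduled" bit (hypothetical model). */
struct toy_napi {
    atomic_flag sched;
};

static void toy_napi_init(struct toy_napi *n)
{
    atomic_flag_clear(&n->sched);
}

/* Take ownership of the queue; succeeds only if nobody else holds the bit. */
static bool toy_schedule_prep(struct toy_napi *n)
{
    return !atomic_flag_test_and_set(&n->sched);
}

/* Give ownership back so the other path can run. */
static void toy_complete(struct toy_napi *n)
{
    atomic_flag_clear(&n->sched);
}

enum { TOY_POLL_BUSY = -1 };

/* Example receive callback: pretends two packets are pending. */
static int toy_receive_two(int budget)
{
    return budget < 2 ? budget : 2;
}

/* Busy-poll path: back off if NAPI owns the queue, otherwise receive
 * up to budget packets and release ownership again. */
static int toy_busy_poll(struct toy_napi *n, int (*receive)(int), int budget)
{
    int received;

    if (!toy_schedule_prep(n))
        return TOY_POLL_BUSY;
    received = receive(budget);
    toy_complete(n);
    return received;
}
```

      In this sketch the busy-poll path and the regular poll path exclude each
      other purely through the one bit, which is the property the message
      relies on to keep the fast path cheap.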
      
      Testing was done between a KVM guest and an external host. The two hosts
      were connected through 40Gb mlx4 cards. With both busy_poll and busy_read
      set to 50 in the guest, 1-byte netperf TCP_RR shows a 127% improvement:
      the transaction rate increased from 8353.33 to 18966.87.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: introduce virtnet_receive() · 2ffa7598
      Jason Wang authored
      
      Move the common receive logic to a new helper, virtnet_receive(). It will
      also be used by the rx busy polling method.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Vlad Yasevich <vyasevic@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 May, 2014 1 commit
  3. 30 Apr, 2014 1 commit
    • virtio-net: Set needed_headroom for virtio-net when VIRTIO_F_ANY_LAYOUT is true · 6ebbc1a6
      Zhangjie (HZ) authored
      This is a small supplement for commit e7428e95 ("virtio-net: put
      virtio-net header inline with data"). TCP packets have enough room to
      hold the virtio-net header, but UDP packets do not. Setting
      dev->needed_headroom for the virtio-net device gives UDP packets enough
      room.
      
      For UDP packets, the sk_buff is allocated in __ip_append_data(). The size
      is "alloclen + hh_len + 15", where "hh_len = LL_RESERVED_SPACE(rt->dst.dev);".
      The macro is defined as follows:
      #define LL_RESERVED_SPACE(dev) \
           ((((dev)->hard_header_len+(dev)->needed_headroom)\
           &~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
      By default, for UDP packets, only 16 bytes are reserved after the skb is
      allocated, and only 2 bytes remain after the MAC header is set. That is
      not enough to hold the virtio-net header. If we set dev->needed_headroom
      to 12 or 10 (depending on whether mergeable_rx_bufs is on or off), more
      room is reserved, and there is then enough room for UDP packets to hold
      the header.
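      
      The arithmetic above can be checked with a small standalone sketch. It
      assumes HH_DATA_MOD is 16 and an Ethernet hard_header_len of 14, which is
      consistent with the "16 bytes reserved, 2 bytes remain" figures in the
      message; struct toy_dev and reserved_space() are hypothetical helpers
      used only for illustration.

```c
/* Assumed value, consistent with the 16-byte figure quoted in the message. */
#define HH_DATA_MOD 16

/* Minimal stand-in for struct net_device: only the fields the macro reads. */
struct toy_dev {
    unsigned short hard_header_len;
    unsigned short needed_headroom;
};

/* Same expression as the macro quoted in the commit message. */
#define LL_RESERVED_SPACE(dev) \
    ((((dev)->hard_header_len + (dev)->needed_headroom) \
      & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

/* Headroom reserved for a device with the given header requirements. */
static int reserved_space(unsigned short hard_header_len,
                          unsigned short needed_headroom)
{
    struct toy_dev dev = { hard_header_len, needed_headroom };

    return LL_RESERVED_SPACE(&dev);
}

/* Bytes left in the headroom once the link-layer header has been written. */
static int room_after_mac(unsigned short hard_header_len,
                          unsigned short needed_headroom)
{
    return reserved_space(hard_header_len, needed_headroom) - hard_header_len;
}
```

      With these assumptions, room_after_mac(14, 0) is 2, while
      room_after_mac(14, 12) and room_after_mac(14, 10) are both 18, enough for
      a 12-byte or 10-byte virtio-net header.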
      
      Test results are listed below:
      guest and host: suse11sp3, netperf, intel 2.4GHz
      +-------+---------+---------+---------+---------+
      |       |   old             |   new             |
      +-------+---------+---------+---------+---------+
      | UDP   |  Gbit/s | pps     |  Gbit/s | pps     |
      | 64    |  0.57   | 692232  |  0.61   | 742420  |
      | 256   |  1.60   | 686860  |  1.71   | 733331  |
      | 512   |  2.92   | 674576  |  3.07   | 710446  |
      | 1024  |  4.99   | 598977  |  5.17   | 620821  |
      | 1460  |  5.68   | 483757  |  7.16   | 610519  |
      | 4096  |  6.98   | 637468  |  7.21   | 658471  |
      +-------+---------+---------+---------+---------+
      Signed-off-by: Zhang Jie <zhangjie14@huawei.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 22 Apr, 2014 1 commit
  5. 27 Mar, 2014 1 commit
    • virtio-net: correct error handling of virtqueue_kick() · 681daee2
      Jason Wang authored
      The current error handling of virtqueue_kick() is wrong in two places:
      - The skb is freed immediately when virtqueue_kick() fails during
        xmit. This may lead to a double free, since the skb was not detached
        from the virtqueue.
      - try_fill_recv() returns false when virtqueue_kick() fails. This
        leads to unnecessary rescheduling of the refill work.
      
      Actually, it is safe to just ignore the kick failure in those two
      places. So this patch fixes the problem by partially reverting commit
      67975901.
      
      Fixes 67975901 (virtio_net: verify if virtqueue_kick() succeeded).
      
      Cc: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 24 Mar, 2014 1 commit
  7. 14 Mar, 2014 1 commit
  8. 12 Mar, 2014 1 commit
  9. 24 Feb, 2014 1 commit
  10. 17 Jan, 2014 3 commits
    • virtio-net: initial rx sysfs support, export mergeable rx buffer size · fbf28d78
      Michael Dalton authored
      
      Add initial support for per-rx queue sysfs attributes to virtio-net. If
      mergeable packet buffers are enabled, add a read-only mergeable packet
      buffer size sysfs attribute for each RX queue.
      Suggested-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: auto-tune mergeable rx buffer size for improved performance · ab7db917
      Michael Dalton authored
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page frag
      allocators") changed the mergeable receive buffer size from PAGE_SIZE to
      MTU-size, introducing a single-stream regression for benchmarks with large
      average packet size. There is no single optimal buffer size for all
      workloads.  For workloads with packet size <= MTU bytes, MTU + virtio-net
      header-sized buffers are preferred as larger buffers reduce the TCP window
      due to SKB truesize. However, single-stream workloads with large average
      packet sizes have higher throughput if larger (e.g., PAGE_SIZE) buffers
      are used.
      
      This commit auto-tunes the mergeable receive buffer size by
      choosing the packet buffer size based on an EWMA of the recent packet
      sizes for the receive queue. Packet buffer sizes range from MTU size +
      virtio-net header length to PAGE_SIZE. This improves throughput for
      large packet workloads, as any workload with average packet size >=
      PAGE_SIZE will use PAGE_SIZE buffers.
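      
      A minimal sketch of this kind of auto-tuning follows. It is a simplified
      model, not the driver's implementation: the constants (4096-byte page,
      1518-byte MTU-sized frame, 12-byte header, EWMA weight of 8) and all
      toy_* names are assumptions chosen for illustration, and the real code
      may also align the result.

```c
/* Illustrative constants; all of these are assumptions, not driver values. */
#define TOY_PAGE_SIZE   4096u
#define TOY_MTU_FRAME   1518u  /* assumed MTU-sized frame length */
#define TOY_HDR_LEN     12u    /* assumed virtio-net header length */
#define TOY_EWMA_WEIGHT 8u     /* assumed smoothing weight */

/* Exponentially weighted moving average of received packet lengths. */
struct toy_ewma {
    unsigned int avg;
};

/* Fold one packet length into the average; the first sample seeds it. */
static void toy_ewma_add(struct toy_ewma *e, unsigned int sample)
{
    if (e->avg == 0)
        e->avg = sample;
    else
        e->avg = (e->avg * (TOY_EWMA_WEIGHT - 1) + sample) / TOY_EWMA_WEIGHT;
}

/* Buffer length for the next refill: average packet size plus header,
 * clamped to the range [MTU-sized frame + header, PAGE_SIZE]. */
static unsigned int toy_mergeable_buf_len(const struct toy_ewma *e)
{
    unsigned int min_len = TOY_MTU_FRAME + TOY_HDR_LEN;
    unsigned int len = e->avg + TOY_HDR_LEN;

    if (len < min_len)
        return min_len;
    if (len > TOY_PAGE_SIZE)
        return TOY_PAGE_SIZE;
    return len;
}
```

      Under these assumptions, a queue that mostly sees small packets stays at
      the minimum size, while a queue whose average packet size reaches
      PAGE_SIZE or more is given PAGE_SIZE buffers.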
      
      These optimizations interact positively with recent commit
      ba275241 ("virtio-net: coalesce rx frags when possible during rx"),
      which coalesces adjacent RX SKB fragments in virtio_net. The coalescing
      optimizations benefit buffers of any size.
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs
      with all offloads & vhost enabled. All VMs and vhost threads run in a
      single 4 CPU cgroup cpuset, using cgroups to ensure that other processes
      in the system will not be scheduled on the benchmark CPUs. Trunk includes
      SKB rx frag coalescing.
      
      net-next w/ virtio_net before 2613af0e (PAGE_SIZE bufs): 14642.85Gb/s
      net-next (MTU-size bufs):  13170.01Gb/s
      net-next + auto-tune: 14555.94Gb/s
      
      Jason Wang also reported a throughput increase on mlx4 from 22Gb/s
      using MTU-sized buffers to about 26Gb/s using auto-tuning.
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: use per-receive queue page frag alloc for mergeable bufs · fb51879d
      Michael Dalton authored
      
      The virtio-net driver currently uses netdev_alloc_frag() for GFP_ATOMIC
      mergeable rx buffer allocations. This commit migrates virtio-net to use
      per-receive queue page frags for GFP_ATOMIC allocation. This change unifies
      mergeable rx buffer memory allocation, which now will use skb_refill_frag()
      for both atomic and GFP-WAIT buffer allocations.
      
      To address fragmentation concerns, if after buffer allocation there
      is too little space left in the page frag to allocate a subsequent
      buffer, the remaining space is added to the current allocated buffer
      so that the remaining space can be used to store packet data.
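      
      The fragmentation rule in the last paragraph can be illustrated with a
      toy allocator. struct toy_page_frag and toy_alloc_buf() are hypothetical
      names, and the sketch leaves out refilling the frag, which the real
      driver would do when the frag is exhausted.

```c
/* Toy page frag: a region of size bytes with a bump-pointer offset. */
struct toy_page_frag {
    unsigned int size;   /* total bytes in the backing page(s) */
    unsigned int offset; /* next free byte */
};

/* Carve one receive buffer of buf_len bytes out of the frag and return the
 * length actually handed out. Returns 0 if the frag cannot fit buf_len
 * (a real caller would refill the frag first). */
static unsigned int toy_alloc_buf(struct toy_page_frag *frag,
                                  unsigned int buf_len)
{
    unsigned int len = buf_len;
    unsigned int hole;

    if (frag->size - frag->offset < buf_len)
        return 0;

    frag->offset += len;
    hole = frag->size - frag->offset;
    if (hole < buf_len) {
        /* Too little left for another buffer: give the tail to this one. */
        len += hole;
        frag->offset += hole;
    }
    return len;
}
```

      For a 4096-byte frag and 1536-byte buffers, the first call hands out
      1536 bytes and the second hands out 2560 bytes, because the 1024-byte
      tail is too small for a third buffer and is folded into the second.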
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 16 Jan, 2014 1 commit
  12. 02 Jan, 2014 1 commit
    • virtio-net: fix refill races during restore · 6cd4ce00
      Jason Wang authored
      During restore, try_fill_recv() was called with neither the napi lock
      held nor napi disabled. This can lead to two try_fill_recv() calls running
      at the same time. Fix this by refilling before trying to enable napi.
      
      Fixes 0741bcb5 (virtio: net: Add freeze, restore handlers to support S4).
      
      Cc: Amit Shah <amit.shah@redhat.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 10 Dec, 2013 2 commits
  14. 06 Dec, 2013 4 commits
    • virtio-net: free bufs correctly on invalid packet length · 98bfd23c
      Michael Dalton authored
      
      When a packet with invalid length arrives, ensure that the packet
      is freed correctly if mergeable packet buffers and big packets
      (GUEST_TSO4) are both enabled.
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Andrew Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio: delete napi structures from netdev before releasing memory · d4fb84ee
      Andrey Vagin authored
      free_netdev calls netif_napi_del too, but it's too late, because napi
      structures are placed on vi->rq. netif_napi_add() is called from
      virtnet_alloc_queues.
      
      general protection fault: 0000 [#1] SMP
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables virtio_balloon pcspkr virtio_net(-) i2c_pii
      CPU: 1 PID: 347 Comm: rmmod Not tainted 3.13.0-rc2+ #171
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff8800b779c420 ti: ffff8800379e0000 task.ti: ffff8800379e0000
      RIP: 0010:[<ffffffff81322e19>]  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
      RSP: 0018:ffff8800379e1dd0  EFLAGS: 00010a83
      RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800379c2fd0 RCX: dead000000200200
      RDX: 6b6b6b6b6b6b6b6b RSI: 0000000000000001 RDI: ffff8800379c2fd0
      RBP: ffff8800379e1dd0 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800379c2f90
      R13: ffff880037839160 R14: 0000000000000000 R15: 00000000013352f0
      FS:  00007f1400e34740(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f464124c763 CR3: 00000000b68cf000 CR4: 00000000000006e0
      Stack:
       ffff8800379e1df0 ffffffff8155beab 6b6b6b6b6b6b6b2b ffff8800378391c0
       ffff8800379e1e18 ffffffff8156499b ffff880037839be0 ffff880037839d20
       ffff88003779d3f0 ffff8800379e1e38 ffffffffa003477c ffff88003779d388
      Call Trace:
       [<ffffffff8155beab>] netif_napi_del+0x1b/0x80
       [<ffffffff8156499b>] free_netdev+0x8b/0x110
       [<ffffffffa003477c>] virtnet_remove+0x7c/0x90 [virtio_net]
       [<ffffffff813ae323>] virtio_dev_remove+0x23/0x80
       [<ffffffff813f62ef>] __device_release_driver+0x7f/0xf0
       [<ffffffff813f6ca0>] driver_detach+0xc0/0xd0
       [<ffffffff813f5f28>] bus_remove_driver+0x58/0xd0
       [<ffffffff813f72ec>] driver_unregister+0x2c/0x50
       [<ffffffff813ae65e>] unregister_virtio_driver+0xe/0x10
       [<ffffffffa0036942>] virtio_net_driver_exit+0x10/0x6ce [virtio_net]
       [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
       [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
       [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
      Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00
      RIP  [<ffffffff81322e19>] __list_del_entry+0x29/0xd0
       RSP <ffff8800379e1dd0>
      ---[ end trace d5931cd3f87c9763 ]---
      
      Fixes: 986a4f4d (virtio_net: multiqueue support)
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: determine type of bufs correctly · fa9fac17
      Andrey Vagin authored
      free_unused_bufs must check vi->mergeable_rx_bufs before
      vi->big_packets, because that is the order used in other places.
      Otherwise we allocate a buffer of one type, then free it as another
      type.
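      
      The ordering requirement can be shown with a small sketch. The enum,
      struct toy_virtnet_info and toy_classify_bufs() are hypothetical; the
      point is only that the allocation path and the free path must consult the
      two flags in the same order, so sharing one classification helper is one
      way to keep them in sync.

```c
#include <stdbool.h>

enum toy_buf_kind {
    TOY_BUF_MERGEABLE, /* page-frag backed buffer */
    TOY_BUF_BIG,       /* chain of pages */
    TOY_BUF_SMALL      /* plain skb */
};

/* Only the two feature flags discussed in the message. */
struct toy_virtnet_info {
    bool mergeable_rx_bufs;
    bool big_packets;
};

/* Single source of truth for the buffer kind: mergeable is checked first,
 * so a device with both flags set is always treated as mergeable. */
static enum toy_buf_kind toy_classify_bufs(const struct toy_virtnet_info *vi)
{
    if (vi->mergeable_rx_bufs)
        return TOY_BUF_MERGEABLE;
    if (vi->big_packets)
        return TOY_BUF_BIG;
    return TOY_BUF_SMALL;
}
```

      If the free path tested big_packets first, a device with both flags set
      would release mergeable buffers as if they were page chains, which is the
      kind of mismatch the trace below comes from.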
      
      general protection fault: 0000 [#1] SMP
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables pcspkr virtio_balloon virtio_net(-) i2c_pii
      CPU: 0 PID: 400 Comm: rmmod Not tainted 3.13.0-rc2+ #170
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      task: ffff8800b6d2a210 ti: ffff8800aed32000 task.ti: ffff8800aed32000
      RIP: 0010:[<ffffffffa00345f3>]  [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
      RSP: 0018:ffff8800aed33dd8  EFLAGS: 00010202
      RAX: ffff8800b1fe2c00 RBX: ffff8800b66a7240 RCX: 6b6b6b6b6b6b6b6b
      RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800b8419a68 RDI: ffff8800b66a1148
      RBP: ffff8800aed33e00 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
      R13: ffff8800b66a1148 R14: 0000000000000000 R15: 000077ff80000000
      FS:  00007fc4f9c4e740(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007f63f432f000 CR3: 00000000b6538000 CR4: 00000000000006f0
      Stack:
       ffff8800b66a7240 ffff8800b66a7380 ffff8800377bd3f0 0000000000000000
       00000000023302f0 ffff8800aed33e18 ffffffffa00346e2 ffff8800b66a7240
       ffff8800aed33e38 ffffffffa003474d ffff8800377bd388 ffff8800377bd390
      Call Trace:
       [<ffffffffa00346e2>] remove_vq_common+0x22/0x40 [virtio_net]
       [<ffffffffa003474d>] virtnet_remove+0x4d/0x90 [virtio_net]
       [<ffffffff813ae303>] virtio_dev_remove+0x23/0x80
       [<ffffffff813f62cf>] __device_release_driver+0x7f/0xf0
       [<ffffffff813f6c80>] driver_detach+0xc0/0xd0
       [<ffffffff813f5f08>] bus_remove_driver+0x58/0xd0
       [<ffffffff813f72cc>] driver_unregister+0x2c/0x50
       [<ffffffff813ae63e>] unregister_virtio_driver+0xe/0x10
       [<ffffffffa0036852>] virtio_net_driver_exit+0x10/0x7be [virtio_net]
       [<ffffffff810d7cf2>] SyS_delete_module+0x172/0x220
       [<ffffffff810a732d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff810f5d4c>] ? __audit_syscall_entry+0x9c/0xf0
       [<ffffffff81677f69>] system_call_fastpath+0x16/0x1b
      Code: c0 74 55 0f 1f 44 00 00 80 7b 30 00 74 7a 48 8b 50 30 4c 89 e6 48 03 73 20 48 85 d2 0f 84 bb 00 00 00 66 0f
      RIP  [<ffffffffa00345f3>] free_unused_bufs+0xc3/0x190 [virtio_net]
       RSP <ffff8800aed33dd8>
      ---[ end trace edb570ea923cce9c ]---
      
      Fixes: 2613af0e (virtio_net: migrate mergeable rx buffers to page frag allocators)
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers/net/*: Fix FSF address in file headers · adf8d3ff
      Jeff Kirsher authored
      Several files refer to an old address for the Free Software Foundation
      in the file header comment.  Resolve by replacing the address with
      the URL <http://www.gnu.org/licenses/> so that we do not have to keep
      updating the header comments anytime the address changes.
      
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: Veaceslav Falico <vfalico@redhat.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Haiyang Zhang <haiyangz@microsoft.com>
      CC: "K. Y. Srinivasan" <kys@microsoft.com>
      CC: Paul Mackerras <paulus@samba.org>
      CC: Ian Campbell <ian.campbell@citrix.com>
      CC: Wei Liu <wei.liu2@citrix.com>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Acked-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 01 Dec, 2013 2 commits
  16. 30 Nov, 2013 1 commit
  17. 18 Nov, 2013 1 commit
  18. 14 Nov, 2013 1 commit
    • virtio-net: mergeable buffer size should include virtio-net header · 5061de36
      Michael Dalton authored
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page
      frag allocators") changed the mergeable receive buffer size from PAGE_SIZE
      to MTU-size. However, the merge buffer size does not take into account the
      size of the virtio-net header. Consequently, packets that are MTU-size
      will take two buffers instead of one (to store the virtio-net header),
      substantially decreasing the throughput of MTU-size traffic due to TCP
      window / SKB truesize effects.
      
      This commit changes the mergeable buffer size to include the virtio-net
      header. The buffer size is cacheline-aligned because skb_page_frag_refill
      will not automatically align the requested size.
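      
      The size calculation can be sketched as follows. The concrete numbers
      used in the usage note (a 1518-byte MTU-sized frame, a 12-byte mergeable
      virtio-net header and a 64-byte cacheline) are assumptions for
      illustration, and the toy_* helpers are hypothetical names rather than
      driver functions.

```c
#include <assert.h>

/* Round len up to a multiple of align, which must be a power of two. */
static unsigned int toy_align_up(unsigned int len, unsigned int align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/* Mergeable buffer length that leaves room for the virtio-net header and
 * is rounded up to the cacheline size. */
static unsigned int toy_merge_buf_len(unsigned int frame_len,
                                      unsigned int hdr_len,
                                      unsigned int cacheline)
{
    return toy_align_up(frame_len + hdr_len, cacheline);
}

/* Number of receive buffers needed to hold data_len bytes. */
static unsigned int toy_bufs_needed(unsigned int data_len,
                                    unsigned int buf_len)
{
    return (data_len + buf_len - 1) / buf_len;
}
```

      With the assumed numbers, a 1518-byte frame plus its 12-byte header needs
      toy_bufs_needed(1530, 1518) = 2 buffers when the buffer ignores the
      header, but only one once the buffer is sized as
      toy_merge_buf_len(1518, 12, 64) = 1536 bytes.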
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs and
      vhost enabled. All VMs and vhost threads run in a single 4 CPU cgroup
      cpuset, using cgroups to ensure that other processes in the system will not
      be scheduled on the benchmark CPUs. Transmit offloads and mergeable receive
      buffers are enabled, but guest_tso4 / guest_csum are explicitly disabled to
      force MTU-sized packets on the receiver.
      
      net-next trunk before 2613af0e (PAGE_SIZE buf): 3861.08Gb/s
      net-next trunk (MTU 1500 - packet uses two buf due to size bug): 4076.62Gb/s
      net-next trunk (MTU 1480 - packet fits in one buf): 6301.34Gb/s
      net-next trunk w/ size fix (MTU 1500 - packet fits in one buf): 6445.44Gb/s
      Suggested-by: Eric Northup <digitaleric@google.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  19. 06 Nov, 2013 1 commit
    • net: Explicitly initialize u64_stats_sync structures for lockdep · 827da44c
      John Stultz authored
      
      In order to enable lockdep on seqcount/seqlock structures, we
      must explicitly initialize any locks.
      
      The u64_stats_sync structure uses a seqcount, and thus we need
      to introduce a u64_stats_init() function and use it to initialize
      the structure.
      
      This unfortunately adds a lot of fairly trivial initialization code
      to a number of drivers. But the benefit of ensuring correctness makes
      this worthwhile.
      
      Because these changes are required for lockdep to be enabled, and the
      changes are quite trivial, I've not yet split this patch out into 30-some
      separate patches, as I figured it would be better to get the various
      maintainers' thoughts on how to best merge this change along with
      the seqcount lockdep enablement.
      
      Feedback would be appreciated!
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Jesse Gross <jesse@nicira.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Mirko Lindner <mlindner@marvell.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Roger Luethi <rl@hellgate.ch>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Wensong Zhang <wensong@linux-vs.org>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  20. 05 Nov, 2013 1 commit
  21. 04 Nov, 2013 1 commit
    • virtio-net: coalesce rx frags when possible during rx · ba275241
      Jason Wang authored
      Commit 2613af0e (virtio_net: migrate mergeable rx buffers to page frag
      allocators) tried to increase the payload/truesize ratio for MTU-sized
      traffic. But this introduces extra overhead for received GSO packets
      because of the frag list. This commit tries to reduce that overhead by
      coalescing rx frags when possible during rx. Test results show about a
      15% improvement on full-size GSO packet receiving (even better than
      before commit 2613af0e).
      
      Before this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    20303.87
      
      After this commit:
      ./netperf -H 192.168.100.4
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.100.4
      () port 0 AF_INET : demo
      Recv   Send    Send
      Socket Socket  Message  Elapsed
      Size   Size    Size     Time     Throughput
      bytes  bytes   bytes    secs.    10^6bits/sec
      
       87380  16384  16384    10.00    23841.26
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Michael Dalton <mwdalton@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 29 Oct, 2013 1 commit
    • virtio-net: correctly handle cpu hotplug notifier during resuming · ec9debbd
      Jason Wang authored
      Commit 3ab098df (virtio-net: don't respond to cpu hotplug notifier if
      we're not ready) tries to bypass the cpu hotplug notifier by checking
      config_enable and doing nothing if it is false. To do so it needs to take
      the config_lock mutex, which may happen in atomic context and leads to
      the following warnings:
      
      [  622.944441] CPU0 attaching NULL sched-domain.
      [  622.944446] CPU1 attaching NULL sched-domain.
      [  622.944485] CPU0 attaching NULL sched-domain.
      [  622.950795] BUG: sleeping function called from invalid context at kernel/mutex.c:616
      [  622.950796] in_atomic(): 1, irqs_disabled(): 1, pid: 10, name: migration/1
      [  622.950796] no locks held by migration/1/10.
      [  622.950798] CPU: 1 PID: 10 Comm: migration/1 Not tainted 3.12.0-rc5-wl-01249-gb91e82d #317
      [  622.950799] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  622.950802]  0000000000000000 ffff88001d42dba0 ffffffff81a32f22 ffff88001bfb9c70
      [  622.950803]  ffff88001d42dbb0 ffffffff810edb02 ffff88001d42dc38 ffffffff81a396ed
      [  622.950805]  0000000000000046 ffff88001d42dbe8 ffffffff810e861d 0000000000000000
      [  622.950805] Call Trace:
      [  622.950810]  [<ffffffff81a32f22>] dump_stack+0x54/0x74
      [  622.950815]  [<ffffffff810edb02>] __might_sleep+0x112/0x114
      [  622.950817]  [<ffffffff81a396ed>] mutex_lock_nested+0x3c/0x3c6
      [  622.950818]  [<ffffffff810e861d>] ? up+0x39/0x3e
      [  622.950821]  [<ffffffff8153ea7c>] ? acpi_os_signal_semaphore+0x21/0x2d
      [  622.950824]  [<ffffffff81565ed1>] ? acpi_ut_release_mutex+0x5e/0x62
      [  622.950828]  [<ffffffff816d04ec>] virtnet_cpu_callback+0x33/0x87
      [  622.950830]  [<ffffffff81a42576>] notifier_call_chain+0x3c/0x5e
      [  622.950832]  [<ffffffff810e86a8>] __raw_notifier_call_chain+0xe/0x10
      [  622.950835]  [<ffffffff810c5556>] __cpu_notify+0x20/0x37
      [  622.950836]  [<ffffffff810c5580>] cpu_notify+0x13/0x15
      [  622.950838]  [<ffffffff81a237cd>] take_cpu_down+0x27/0x3a
      [  622.950841]  [<ffffffff81136289>] stop_machine_cpu_stop+0x93/0xf1
      [  622.950842]  [<ffffffff81136167>] cpu_stopper_thread+0xa0/0x12f
      [  622.950844]  [<ffffffff811361f6>] ? cpu_stopper_thread+0x12f/0x12f
      [  622.950847]  [<ffffffff81119710>] ? lock_release_holdtime.part.7+0xa3/0xa8
      [  622.950848]  [<ffffffff81135e4b>] ? cpu_stop_should_run+0x3f/0x47
      [  622.950850]  [<ffffffff810ea9b0>] smpboot_thread_fn+0x1c5/0x1e3
      [  622.950852]  [<ffffffff810ea7eb>] ? lg_global_unlock+0x67/0x67
      [  622.950854]  [<ffffffff810e36b7>] kthread+0xd8/0xe0
      [  622.950857]  [<ffffffff81a3bfad>] ? wait_for_common+0x12f/0x164
      [  622.950859]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
      [  622.950861]  [<ffffffff81a45ffc>] ret_from_fork+0x7c/0xb0
      [  622.950862]  [<ffffffff810e35df>] ? kthread_create_on_node+0x124/0x124
      [  622.950876] smpboot: CPU 1 is now offline
      [  623.194556] SMP alternatives: lockdep: fixing up alternatives
      [  623.194559] smpboot: Booting Node 0 Processor 1 APIC 0x1
      ...
      
      A correct fix is to unregister the hotcpu notifier during restore and register a
      new one in resume.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Tested-by: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Wanlong Gao <gaowanlong@cn.fujitsu.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 28 Oct, 2013 3 commits
  24. 17 Oct, 2013 2 commits
  25. 16 Oct, 2013 1 commit
  26. 23 Sep, 2013 1 commit
  27. 03 Sep, 2013 1 commit
  28. 27 Jul, 2013 1 commit
  29. 09 Jul, 2013 1 commit
    • virtio_net: fix race in RX VQ processing · cbdadbbf
      Michael S. Tsirkin authored
      
      virtio-net called virtqueue_enable_cb on the RX path after napi_complete,
      so with NAPI_STATE_SCHED clear, outside the implicit napi lock.
      This violates the requirement to synchronize virtqueue_enable_cb with
      respect to virtqueue_add_buf.  In particular, the used event can move
      backwards, causing us to lose interrupts.
      In a debug build, this can trigger a panic within START_USE.
      
      Jason Wang reports that he can trigger the races artificially,
      by adding udelay() in virtqueue_enable_cb() after virtio_mb().
      
      However, we must call napi_complete to clear NAPI_STATE_SCHED before
      polling the virtqueue for used buffers, otherwise napi_schedule_prep in
      a callback will fail, causing us to lose RX events.
      
      To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
      set (under napi lock), later call virtqueue_poll with
      NAPI_STATE_SCHED clear (outside the lock).
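      
      The ordering in the fix can be modeled with a toy virtqueue. This is a
      sketch under stated assumptions, not the driver code: struct toy_vq and
      the toy_* functions are hypothetical, a plain bool stands in for
      NAPI_STATE_SCHED, and the late_arrivals parameter simulates the device
      adding used buffers in the window after the flag is cleared.

```c
#include <stdbool.h>

/* Toy virtqueue: only the two indices needed for the illustration. */
struct toy_vq {
    unsigned int used_idx;      /* advanced by the device */
    unsigned int last_used_idx; /* advanced by the driver as it consumes */
};

/* Step 1, while the SCHED flag is still set: re-arm callbacks and take a
 * snapshot of how far the driver has consumed. */
static unsigned int toy_enable_cb_prepare(const struct toy_vq *vq)
{
    return vq->last_used_idx;
}

/* Step 3, after the SCHED flag is cleared: read-only check for used
 * buffers that arrived since the snapshot. */
static bool toy_poll(const struct toy_vq *vq, unsigned int snapshot)
{
    return vq->used_idx != snapshot;
}

/* End-of-poll sequence; returns true if the caller must poll again. */
static bool toy_rx_poll_done(struct toy_vq *vq, bool *napi_sched,
                             unsigned int late_arrivals)
{
    unsigned int snapshot = toy_enable_cb_prepare(vq); /* SCHED set */

    *napi_sched = false;           /* step 2: napi_complete() equivalent */
    vq->used_idx += late_arrivals; /* device races with completion */

    if (toy_poll(vq, snapshot)) {  /* SCHED clear, no vq state written */
        *napi_sched = true;        /* take ownership back and re-poll */
        return true;
    }
    return false;
}
```

      In this model the only virtqueue write (the re-arm) happens while the
      flag is set, and the check made after clearing it only reads, so buffers
      that arrive in the window are noticed without touching the queue outside
      the lock.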
      Reported-by: Jason Wang <jasowang@redhat.com>
      Tested-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>