  1. 14 Aug, 2014 11 commits
    • rhashtable: unexport and make rht_obj() static · c91eee56
      Thomas Graf authored
      No need to export rht_obj(), all inner to outer object translations
      occur internally. It was intended to be used with rht_for_each() which
      now primarily serves as the iterator for rhashtable_remove_pprev() to
      effectively flush and free the full table.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: RCU annotations for next pointers · 5300fdcb
      Thomas Graf authored
      Properly annotate next pointers as access is RCU protected in
      the lookup path.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: fix ssthresh and undo for consecutive short FRTO episodes · 0c9ab092
      Neal Cardwell authored
      Fix TCP FRTO logic so that it always notices when snd_una advances,
      indicating that any RTO after that point will be a new and distinct
      loss episode.
      
      Previously there was a very specific sequence that could cause FRTO to
      fail to notice a new loss episode had started:
      
      (1) RTO timer fires, enter FRTO and retransmit packet 1 in write queue
      (2) receiver ACKs packet 1
      (3) FRTO sends 2 more packets
      (4) RTO timer fires again (should start a new loss episode)
      
      The problem was in step (3) above, where tcp_process_loss() returned
      early (in the spot marked "Step 2.b"), so that it never got to the
      logic to clear icsk_retransmits. Thus icsk_retransmits stayed
      non-zero. Thus in step (4) tcp_enter_loss() would see the non-zero
      icsk_retransmits, decide that this RTO is not a new episode, and
      decide not to cut ssthresh and remember the current cwnd and ssthresh
      for undo.
      
      There were two main consequences to the bug that we have
      observed. First, ssthresh was not decreased in step (4). Second, when
      there was a series of such FRTO (1-4) sequences that happened to be
      followed by an FRTO undo, we would restore the cwnd and ssthresh from
      before the entire series started (instead of the cwnd and ssthresh
      from before the most recent RTO). This could result in cwnd and
      ssthresh being restored to values much bigger than the proper values.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Fixes: e33099f9 ("tcp: implement RFC5682 F-RTO")
      Signed-off-by: David S. Miller <davem@davemloft.net>
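The FRTO failure sequence above can be illustrated with a toy model. This is a sketch, not the kernel code: `struct toy_sock`, `toy_enter_loss()` and `toy_ack_advances_una()` are hypothetical names, and the "new episode" decision is reduced to a check of a retransmit counter standing in for icsk_retransmits, which simplifies what tcp_enter_loss() really tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: field names loosely mirror the commit message. */
struct toy_sock {
    uint32_t cwnd;
    uint32_t ssthresh;
    uint32_t prior_cwnd;      /* saved for a later undo */
    uint32_t prior_ssthresh;  /* saved for a later undo */
    unsigned retransmits;     /* stands in for icsk_retransmits */
};

/* RTO fired. Returns true if it was treated as a new loss episode. */
static bool toy_enter_loss(struct toy_sock *sk)
{
    bool new_episode = (sk->retransmits == 0);

    if (new_episode) {
        /* Remember state for undo, then cut ssthresh. */
        sk->prior_cwnd = sk->cwnd;
        sk->prior_ssthresh = sk->ssthresh;
        sk->ssthresh = sk->cwnd / 2 > 2 ? sk->cwnd / 2 : 2;
    }
    sk->cwnd = 1;
    sk->retransmits++;
    return new_episode;
}

/*
 * An ACK advanced snd_una. With fixed == false the early return models
 * the old behaviour, where the counter was never cleared.
 */
static void toy_ack_advances_una(struct toy_sock *sk, bool fixed)
{
    if (!fixed)
        return;            /* old code: returned before the reset */
    sk->retransmits = 0;   /* fixed code: a later RTO is a new episode */
}
```

In the old path a second RTO keeps the undo state from before the first one; in the fixed path the undo state is refreshed at each new episode.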
    • tcp: don't allow syn packets without timestamps to pass tcp_tw_recycle logic · a26552af
      Hannes Frederic Sowa authored
      tcp_tw_recycle heavily relies on tcp timestamps to build a per-host
      ordering of incoming connections and teardowns without the need to
      hold state on a specific quadruple for TCP_TIMEWAIT_LEN, but only for
      the last measured RTO. To do so, we keep the last seen timestamp in a
      per-host indexed data structure and verify if the incoming timestamp
      in a connection request is strictly greater than the saved one during
      last connection teardown. Thus we can verify later on that no old data
      packets will be accepted by the new connection.
      
      While moving a socket to time-wait state we already verify whether timestamps
      were seen on a connection. Only if that was the case we let the
      time-wait socket expire after the RTO, otherwise normal TCP_TIMEWAIT_LEN
      will be used. But we don't verify this on incoming SYN packets. If a
      connection teardown was less than TCP_PAWS_MSL seconds in the past we
      cannot guarantee to not accept data packets from an old connection if
      no timestamps are present. We should drop this SYN packet. This patch
      closes this loophole.
      
      Please note, this patch does not make tcp_tw_recycle in any way more
      usable but only adds another safety check:
      Sporadic drops of SYN packets because of reordering in the network or
      in the socket backlog queues can happen. Users behind NAT trying to
      connect to a tcp_tw_recycle enabled server can get caught in blackholes
      and their connection requests may regularly get dropped because hosts
      behind an address translator don't have synchronized tcp timestamp clocks.
      tcp_tw_recycle cannot work if peers don't have tcp timestamps enabled.
      
      In general, use of tcp_tw_recycle is discouraged.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Florian Westphal <fw@strlen.de>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
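The added SYN check can be sketched as a small decision function. This is a simplified model under stated assumptions: `struct toy_peer`, `toy_syn_allowed()` and the value of `TOY_PAWS_MSL` are hypothetical, and the per-host cache is reduced to a single record.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAWS_MSL 60  /* seconds; illustrative value */

/* Hypothetical per-host record kept at connection teardown. */
struct toy_peer {
    uint32_t last_ts;        /* last TCP timestamp seen from this host */
    int64_t  teardown_time;  /* time of the last teardown, in seconds */
};

/* Decide whether an incoming SYN may pass the recycle check. */
static bool toy_syn_allowed(const struct toy_peer *peer, int64_t now,
                            bool syn_has_ts, uint32_t syn_ts)
{
    /* Teardown is old enough: no old data packets can still arrive. */
    if (now - peer->teardown_time >= TOY_PAWS_MSL)
        return true;

    /* The added safety check: no timestamp means no ordering proof. */
    if (!syn_has_ts)
        return false;

    /* Wrap-safe "strictly greater than the saved timestamp". */
    return (int32_t)(syn_ts - peer->last_ts) > 0;
}
```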
    • tcp: fix tcp_release_cb() to dispatch via address family for mtu_reduced() · 4fab9071
      Neal Cardwell authored
      Make sure we use the correct address-family-specific function for
      handling MTU reductions from within tcp_release_cb().
      
      Previously AF_INET6 sockets were incorrectly always using the IPv6
      code path when sometimes they were handling IPv4 traffic and thus had
      an IPv4 dst.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Willem de Bruijn <willemb@google.com>
      Fixes: 563d34d0 ("tcp: dont drop MTU reduction indications")
      Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
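The difference between choosing the handler by socket family and dispatching through per-address-family operations can be sketched as follows. All names here (`struct toy_af_ops`, `toy_release_cb_old()`, `toy_release_cb_fixed()`) are hypothetical, and the handlers only return 4 or 6 to show which path ran.

```c
#include <stdbool.h>

struct toy_sock;

/* Per-address-family operations, following the traffic actually carried. */
struct toy_af_ops {
    int (*mtu_reduced)(struct toy_sock *sk);
};

struct toy_sock {
    bool is_inet6_socket;             /* socket family as created */
    const struct toy_af_ops *af_ops;  /* matches the dst in use */
};

static int toy_v4_mtu_reduced(struct toy_sock *sk) { (void)sk; return 4; }
static int toy_v6_mtu_reduced(struct toy_sock *sk) { (void)sk; return 6; }

static const struct toy_af_ops toy_ipv4_ops = { .mtu_reduced = toy_v4_mtu_reduced };
static const struct toy_af_ops toy_ipv6_ops = { .mtu_reduced = toy_v6_mtu_reduced };

/* Old behaviour: pick the handler from the socket family alone. */
static int toy_release_cb_old(struct toy_sock *sk)
{
    return sk->is_inet6_socket ? toy_v6_mtu_reduced(sk)
                               : toy_v4_mtu_reduced(sk);
}

/* Fixed behaviour: dispatch through the address-family ops table. */
static int toy_release_cb_fixed(struct toy_sock *sk)
{
    return sk->af_ops->mtu_reduced(sk);
}
```

An AF_INET6 socket that carries IPv4 traffic has `af_ops` pointing at the IPv4 table, so only the second function reaches the IPv4 handler.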
    • sit: Fix ipip6_tunnel_lookup device matching criteria · bc8fc7b8
      Shmulik Ladkani authored
      As of 4fddbf5d ("sit: strictly restrict incoming traffic to tunnel link device"),
      when looking up a tunnel, the tunnel's underlying interface (t->parms.link)
      is verified to match the incoming traffic's ingress device.
      
      However the comparison was incorrectly based on skb->dev->iflink.
      
      Instead, dev->ifindex should be used, which correctly represents the
      interface from which the IP stack hands the ipip6 packets.
      
      This allows setting up sit tunnels bound to vlan interfaces (otherwise
      incoming ipip6 traffic on the vlan interface was dropped due to
      ipip6_tunnel_lookup match failure).
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
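Why the iflink comparison fails for a tunnel bound to a vlan interface can be shown with reduced records. This is a sketch with hypothetical types; treating `link == 0` as "not bound to any device" is an assumption of the model.

```c
#include <stdbool.h>

/* Hypothetical, reduced device and tunnel records. */
struct toy_netdev {
    int ifindex;  /* this device's own index */
    int iflink;   /* index of the lower device it is stacked on */
};

struct toy_tunnel {
    int link;     /* bound interface index; 0 assumed to mean unbound */
};

/* Old criterion: compares against the lower device of the ingress device. */
static bool toy_match_old(const struct toy_tunnel *t,
                          const struct toy_netdev *in)
{
    return t->link == 0 || t->link == in->iflink;
}

/* Fixed criterion: compares against the ingress device itself. */
static bool toy_match_fixed(const struct toy_tunnel *t,
                            const struct toy_netdev *in)
{
    return t->link == 0 || t->link == in->ifindex;
}
```

For a physical device both indices are the same, so the two criteria only differ for stacked devices such as vlans.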
    • net: ethernet: ibm: ehea: Remove duplicate object from Makefile · 3b3e0ea8
      Andreas Ruprecht authored
      In the Makefile, ehea_phyp.o is included twice in the list of
      object files compiled into ehea.o.
      
      This change removes one instance.
      Signed-off-by: Andreas Ruprecht <rupran@einserver.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: xgene: Check negative return value of xgene_enet_get_ring_size() · 9b9ba821
      Tobias Klauser authored
      xgene_enet_get_ring_size() returns a negative value in case of an error,
      but its only caller in xgene_enet_create_desc_ring() currently uses the
      return value directly as u32. Instead, check for a negative value first and
      error out in that case. Also move the call to xgene_enet_get_ring_size() before
      devm_kzalloc() so we don't need to free anything in the error path.
      
      This fixes the following issue reported by the Coverity Scanner:
      
      ** CID 1231336:  Improper use of negative value  (NEGATIVE_RETURNS)
      /drivers/net/ethernet/apm/xgene/xgene_enet_main.c: 596 in xgene_enet_create_desc_ring()
      Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
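The NEGATIVE_RETURNS pattern reported by Coverity can be reproduced in a few lines. The helper `toy_get_ring_size()` and its size values are hypothetical stand-ins; only the "negative errno used as u32" shape follows the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in: returns a size in bytes, or a negative errno. */
static int toy_get_ring_size(int cfgsize)
{
    switch (cfgsize) {
    case 0:  return 0x200;
    case 1:  return 0x2000;
    default: return -EINVAL;
    }
}

/* Buggy pattern: the negative error becomes a huge unsigned size. */
static uint32_t toy_ring_size_unchecked(int cfgsize)
{
    return (uint32_t)toy_get_ring_size(cfgsize);
}

/* Fixed pattern: test the signed value first, then convert. */
static int toy_ring_size_checked(int cfgsize, uint32_t *size)
{
    int ret = toy_get_ring_size(cfgsize);

    if (ret < 0)
        return ret;
    *size = (uint32_t)ret;
    return 0;
}
```

Doing the size lookup before any allocation, as the commit describes, also means the error path has nothing to free.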
    • tcp: don't use timestamp from repaired skb-s to calculate RTT (v2) · 9d186cac
      Andrey Vagin authored
      We don't know the right timestamp for repaired skb-s. Wrong RTT estimations
      aren't good, because some congestion modules heavily depend on them.
      
      This patch adds the TCPCB_REPAIRED flag, which is included in
      TCPCB_RETRANS.
      
      Thanks to Eric for the advice on how to fix this issue.
      
      This patch fixes the warning:
      [  879.562947] WARNING: CPU: 0 PID: 2825 at net/ipv4/tcp_input.c:3078 tcp_ack+0x11f5/0x1380()
      [  879.567253] CPU: 0 PID: 2825 Comm: socket-tcpbuf-l Not tainted 3.16.0-next-20140811 #1
      [  879.567829] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  879.568177]  0000000000000000 00000000c532680c ffff880039643d00 ffffffff817aa2d2
      [  879.568776]  0000000000000000 ffff880039643d38 ffffffff8109afbd ffff880039d6ba80
      [  879.569386]  ffff88003a449800 000000002983d6bd 0000000000000000 000000002983d6bc
      [  879.569982] Call Trace:
      [  879.570264]  [<ffffffff817aa2d2>] dump_stack+0x4d/0x66
      [  879.570599]  [<ffffffff8109afbd>] warn_slowpath_common+0x7d/0xa0
      [  879.570935]  [<ffffffff8109b0ea>] warn_slowpath_null+0x1a/0x20
      [  879.571292]  [<ffffffff816d0a05>] tcp_ack+0x11f5/0x1380
      [  879.571614]  [<ffffffff816d10bd>] tcp_rcv_established+0x1ed/0x710
      [  879.571958]  [<ffffffff816dc9da>] tcp_v4_do_rcv+0x10a/0x370
      [  879.572315]  [<ffffffff81657459>] release_sock+0x89/0x1d0
      [  879.572642]  [<ffffffff816c81a0>] do_tcp_setsockopt.isra.36+0x120/0x860
      [  879.573000]  [<ffffffff8110a52e>] ? rcu_read_lock_held+0x6e/0x80
      [  879.573352]  [<ffffffff816c8912>] tcp_setsockopt+0x32/0x40
      [  879.573678]  [<ffffffff81654ac4>] sock_common_setsockopt+0x14/0x20
      [  879.574031]  [<ffffffff816537b0>] SyS_setsockopt+0x80/0xf0
      [  879.574393]  [<ffffffff817b40a9>] system_call_fastpath+0x16/0x1b
      [  879.574730] ---[ end trace a17cbc38eb8c5c00 ]---
      
      v2: moving setting of skb->when for repaired skb-s in tcp_write_xmit,
          where it's set for other skb-s.
      
      Fixes: 431a9124 ("tcp: timestamp SYN+DATA messages")
      Fixes: 740b0f18 ("tcp: switch rtt estimations to usec resolution")
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
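Folding a "repaired" bit into the retransmit mask means that any code which already skips retransmitted segments for RTT sampling skips repaired ones too. The sketch below assumes that the RTT path tests the combined mask; the `TOY_*` names and bit values are illustrative, not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values are illustrative only. */
#define TOY_SACKED_RETRANS 0x02u
#define TOY_REPAIRED       0x10u
#define TOY_EVER_RETRANS   0x80u

/* The repaired bit is part of the combined retransmit mask. */
#define TOY_RETRANS (TOY_SACKED_RETRANS | TOY_EVER_RETRANS | TOY_REPAIRED)

/* A segment's send time is only trusted if no mask bit is set. */
static bool toy_usable_for_rtt(uint8_t sacked)
{
    return (sacked & TOY_RETRANS) == 0;
}
```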
    • net: xilinx: Remove .owner field for driver · fdd42e44
      Michal Simek authored
      There is no need to init .owner field.
      
      Based on the patch from Peter Griffin <peter.griffin@linaro.org>
      "mmc: remove .owner field for drivers using module_platform_driver"
      
      "This patch removes the superfluous .owner field for drivers which
      use the module_platform_driver API, as this is overridden in
      platform_driver_register anyway."
      Signed-off-by: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "macvlan: simplify the structure port" · 5e3c516b
      David S. Miller authored
      This reverts commit a188a54d.
      
      It causes crashes
      
      ====================
      [   80.643286] BUG: unable to handle kernel NULL pointer dereference at 0000000000000878
      [   80.670103] IP: [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.691289] PGD 22c102067 PUD 235bf0067 PMD 0
      [   80.706611] Oops: 0002 [#1] SMP
      [   80.717836] Modules linked in: macvlan nfsd lockd nfs_acl exportfs auth_rpcgss sunrpc oid_registry ioatdma ixgbe(-) mdio igb dca
      [   80.757935] CPU: 37 PID: 6724 Comm: rmmod Not tainted 3.16.0-net-next-08-12-2014-FCoE+ #1
      [   80.785688] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
      [   80.820310] task: ffff880235a9eae0 ti: ffff88022e844000 task.ti: ffff88022e844000
      [   80.845770] RIP: 0010:[<ffffffff810832e4>]  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   80.875326] RSP: 0018:ffff88022e847b28  EFLAGS: 00010046
      [   80.893251] RAX: 0000000000037a6a RBX: 0000000000000878 RCX: 0000000000000000
      [   80.917187] RDX: ffff880235a9eae0 RSI: 0000000000000001 RDI: ffffffff810832db
      [   80.941125] RBP: ffff88022e847b58 R08: 0000000000000000 R09: 0000000000000000
      [   80.965056] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022e847b70
      [   80.988994] R13: 0000000000000000 R14: ffff88022e847be8 R15: ffffffff81ebe440
      [   81.012929] FS:  00007fab90b07700(0000) GS:ffff88043f7a0000(0000) knlGS:0000000000000000
      [   81.040400] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   81.059757] CR2: 0000000000000878 CR3: 0000000235a42000 CR4: 00000000001407e0
      [   81.083689] Stack:
      [   81.090739]  ffff880235a9eae0 0000000000000878 ffff88022e847b70 0000000000000000
      [   81.116253]  ffff88022e847be8 ffffffff81ebe440 ffff88022e847b98 ffffffff810847f1
      [   81.141766]  ffff88022e847b78 0000000000000286 ffff880234200000 0000000000000000
      [   81.167282] Call Trace:
      [   81.175768]  [<ffffffff810847f1>] __cancel_work_timer+0x31/0x170
      [   81.195985]  [<ffffffff8108494b>] cancel_work_sync+0xb/0x10
      [   81.214769]  [<ffffffffa015ae68>] macvlan_port_destroy+0x28/0x60 [macvlan]
      [   81.237844]  [<ffffffffa015b930>] macvlan_uninit+0x40/0x50 [macvlan]
      [   81.259209]  [<ffffffff816bf6e2>] rollback_registered_many+0x1a2/0x2c0
      [   81.281140]  [<ffffffff816bf81a>] unregister_netdevice_many+0x1a/0xb0
      [   81.302786]  [<ffffffffa015a4ff>] macvlan_device_event+0x1ef/0x240 [macvlan]
      [   81.326439]  [<ffffffff8108a13d>] notifier_call_chain+0x4d/0x70
      [   81.346366]  [<ffffffff8108a201>] raw_notifier_call_chain+0x11/0x20
      [   81.367439]  [<ffffffff816bf25b>] call_netdevice_notifiers_info+0x3b/0x70
      [   81.390228]  [<ffffffff816bf2a1>] call_netdevice_notifiers+0x11/0x20
      [   81.411587]  [<ffffffff816bf6bd>] rollback_registered_many+0x17d/0x2c0
      [   81.433518]  [<ffffffff816bf925>] unregister_netdevice_queue+0x75/0x110
      [   81.455735]  [<ffffffff816bfb2b>] unregister_netdev+0x1b/0x30
      [   81.475094]  [<ffffffffa0039b50>] ixgbe_remove+0x170/0x1d0 [ixgbe]
      [   81.495886]  [<ffffffff813512a2>] pci_device_remove+0x32/0x60
      [   81.515246]  [<ffffffff814c75c4>] __device_release_driver+0x64/0xd0
      [   81.536321]  [<ffffffff814c76f8>] driver_detach+0xc8/0xd0
      [   81.554530]  [<ffffffff814c656e>] bus_remove_driver+0x4e/0xa0
      [   81.573888]  [<ffffffff814c828b>] driver_unregister+0x2b/0x60
      [   81.593246]  [<ffffffff8135143e>] pci_unregister_driver+0x1e/0xa0
      [   81.613749]  [<ffffffffa005db18>] ixgbe_exit_module+0x1c/0x2e [ixgbe]
      [   81.635401]  [<ffffffff810e738b>] SyS_delete_module+0x15b/0x1e0
      [   81.655334]  [<ffffffff8187a395>] ? sysret_check+0x22/0x5d
      [   81.673833]  [<ffffffff810abd2d>] ? trace_hardirqs_on_caller+0x11d/0x1e0
      [   81.696339]  [<ffffffff8132bfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [   81.717985]  [<ffffffff8187a369>] system_call_fastpath+0x16/0x1b
      [   81.738199] Code: 00 48 83 3d 6e bb da 00 00 48 89 c2 0f 84 67 01 00 00 fa 66 0f 1f 44 00 00 49 89 14 24 e8 b5 4b 02 00 45 84 ed 0f 85 ac 00 00 00 <f0> 0f ba 2b 00 72 1d 31 c0 48 8b 5d d8 4c 8b 65 e0 4c 8b 6d e8
      [   81.807026] RIP  [<ffffffff810832e4>] try_to_grab_pending+0x64/0x1f0
      [   81.828468]  RSP <ffff88022e847b28>
      [   81.840384] CR2: 0000000000000878
      [   81.851731] ---[ end trace 9f6c7232e3464e11 ]---
      ====================
      
      This bug could be triggered by these steps:
      
      modprobe ixgbe ; modprobe macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:00 macvlan0 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:01 macvlan1 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:02 macvlan2 type macvlan
      ip link add link p96p1 address 00:1B:21:6E:06:03 macvlan3 type macvlan
      rmmod ixgbe
      Reported-by: "Keller, Jacob E" <jacob.e.keller@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 Aug, 2014 29 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net · db8d457a
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2014-08-12
      
      This series contains updates to i40e and e1000e.
      
      Lucas provides a fix for i40e to resolve a compile issue where a header
      was missing in the #includes.
      
      Wei Yongjun provides a fix for i40e to resolve a sparse warning, where
      a non-static function should be static.
      
      Julia Lawall provides a fix for i40e which was found using Coccinelle,
      where there was a typo in the name of the type given to sizeof().
      
      Rickard Strandqvist provides a fix for i40e to replace the use of
      strncpy() with strlcpy() to avoid strings that lack null termination.
      
      Jean Sacren provides two e1000e fixes, first is a comment fix and second
      removes an excessive space character in a debug message.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'xen-netback-synchronization' · a4688132
      David S. Miller authored
      Wei Liu says:
      
      ====================
      xen-netback: synchronisation between core driver and netback
      
      The zero-copy netback has far more interactions with core network driver than
      the old copying backend. One significant thing is that netback now relies on
      a callback from core driver to correctly release resources.
      
      However correct synchronisation between core driver and netback is missing.
      Currently netback relies on a loop to wait for core driver to release
      resources. This has recently proven insufficient and error-prone, partly due
      to code structure, partly due to missing synchronisation. Short-lived domains
      like OpenMirage unikernels can easily trigger a race in the backend, rendering
      it unresponsive.
      
      This patch series aims to solve this issue by introducing proper
      synchronisation between core driver and netback.
      
      Changes in v4:
      * avoid using wait queue
      * remove dedicated loop for netif_napi_del
      * remove unnecessary check on callback
      
      Change in v3: improve commit message in patch 1
      
      Change in v2: fix Zoltan's email address in commit message
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: remove loop waiting function · b1252858
      Wei Liu authored
      The original implementation relies on a loop to check if all inflight
      packets are freed. Now that we have proper reference counting, there's no
      need to use the loop anymore.
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: don't stop dealloc kthread too early · a64bd934
      Wei Liu authored
      Reference count the number of packets in host stack, so that we don't
      stop the deallocation thread too early. Otherwise, we can end up with
      xenvif_free permanently waiting for deallocation thread to unmap grefs.
      Reported-by: Thomas Leonard <talex5@gmail.com>
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
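The idea of counting packets that are still in the host stack can be sketched with C11 atomics. This is a userspace illustration with hypothetical names (`struct toy_queue`, `toy_dealloc_may_stop()`); the driver uses kernel primitives and its own structures.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_queue {
    atomic_int inflight;   /* packets currently held by the host stack */
};

/* A packet was handed to the host stack. */
static void toy_packet_sent_to_stack(struct toy_queue *q)
{
    atomic_fetch_add(&q->inflight, 1);
}

/* Called from the (hypothetical) zero-copy completion callback. */
static void toy_packet_released(struct toy_queue *q)
{
    atomic_fetch_sub(&q->inflight, 1);
}

/* The dealloc thread may only stop once nothing is left in flight. */
static bool toy_dealloc_may_stop(struct toy_queue *q, bool stop_requested)
{
    return stop_requested && atomic_load(&q->inflight) == 0;
}
```

With this condition a stop request alone is not enough: the thread keeps running until the last outstanding packet has been released.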
    • xen-netback: move NAPI add/remove calls · ea2c5e13
      Wei Liu authored
      Originally netif_napi_add was in xenvif_init_queue and netif_napi_del
      was in xenvif_deinit_queue, while kthreads were handled in
      xenvif_connect and xenvif_disconnect. Move netif_napi_add and
      netif_napi_del to xenvif_connect and xenvif_disconnect so that they
      reside together with kthread operations.
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'xen-netback-debugfs' · 68809958
      David S. Miller authored
      Wei Liu says:
      
      ====================
      xen-netback: fix debugfs code
      
      This small series fixes two problems in xen-netback debugfs code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: fix debugfs entry creation · 628fa76b
      Wei Liu authored
      The original code is bogus. The function gets called in a loop which
      leaks entries created in previous rounds.
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netback: fix debugfs write length check · 5c807005
      Wei Liu authored
      Enlarge buffer size and check input length properly, so that we don't
      misuse -ENOSPC.
      
      Note that a command like "kickXXXX" is still allowed; that's a patch for
      another day if we really want to be very strict on this.
      Reported-by: SeeChen Ng <seechen81@gmail.com>
      Signed-off-by: Wei Liu <wei.liu2@citrix.com>
      Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
      Cc: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net-timestamp: fix missing tcp fragmentation cases · 490cc7d0
      Willem de Bruijn authored
      Bytestream timestamps are correlated with a single byte in the skbuff,
      recorded in skb_shinfo(skb)->tskey. When fragmenting skbuffs, ensure
      that the tskey is set for the fragment in which the tskey falls
      (seqno <= tskey < end_seqno).
      
      The original implementation did not address fragmentation in
      tcp_fragment or tso_fragment. Add code to inspect the sequence numbers
      and move both tskey and the relevant tx_flags if necessary.
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
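The rule that the key follows the byte it refers to (seqno <= tskey < end_seqno) can be sketched with a reduced buffer record. `struct toy_skb` and `toy_fragment()` are hypothetical; a single boolean stands in for the timestamping tx_flags.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_skb {
    uint32_t seq, end_seq;   /* covers bytes [seq, end_seq) */
    bool     has_tstamp;     /* stands in for the timestamping tx_flags */
    uint32_t tskey;          /* byte the timestamp is correlated with */
};

/* Wrap-safe "a < b" on 32-bit sequence numbers. */
static bool toy_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Split head at split_seq; tail receives [split_seq, end_seq). */
static void toy_fragment(struct toy_skb *head, struct toy_skb *tail,
                         uint32_t split_seq)
{
    tail->seq = split_seq;
    tail->end_seq = head->end_seq;
    head->end_seq = split_seq;
    tail->has_tstamp = false;
    tail->tskey = 0;

    /* If the key byte now lives in the tail, move key and flag there. */
    if (head->has_tstamp && !toy_before(head->tskey, tail->seq)) {
        tail->has_tstamp = true;
        tail->tskey = head->tskey;
        head->has_tstamp = false;
        head->tskey = 0;
    }
}
```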
    • net-timestamp: fix missing ACK timestamp · 712a7221
      Willem de Bruijn authored
      ACK timestamps are generated in tcp_clean_rtx_queue. The TSO datapath
      can break out early, causing the timestamp code to be skipped. Move
      the code up before the break.
      Reported-by: David S. Miller <davem@davemloft.net>
      
      Also fix a boundary condition: tp->snd_una is the next unacknowledged
      byte and between() tests inclusively (a <= b <= c), so generate an ACK
      timestamp if (prior_snd_una <= tskey <= tp->snd_una - 1).
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
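The boundary condition can be checked with a wrap-safe inclusive range test. The helper names are hypothetical, and the explicit guard for an ACK that does not advance snd_una is an addition of this sketch, needed because the range would otherwise wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive, wrap-safe test for seq2 <= seq1 <= seq3. */
static bool toy_between(uint32_t seq1, uint32_t seq2, uint32_t seq3)
{
    return seq3 - seq2 >= seq1 - seq2;
}

/*
 * snd_una is the next unacknowledged byte, so the newly acknowledged
 * bytes are prior_snd_una .. snd_una - 1.
 */
static bool toy_ack_covers_tskey(uint32_t prior_snd_una, uint32_t snd_una,
                                 uint32_t tskey)
{
    if (snd_una == prior_snd_una)
        return false;  /* nothing newly acknowledged (sketch-only guard) */
    return toy_between(tskey, prior_snd_una, snd_una - 1);
}
```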
    • drivers/net/irda/donauboe.c: convert to module_pci_driver · cd094927
      Libo Chen authored
      Signed-off-by: Libo Chen <libo.chen@huawei.com>
      Cc: Samuel Ortiz <samuel@sortiz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Maks Naumov's avatar
    • libcxgbi/cxgb4i : Fix ipv6 build failure caught with randconfig · 8d21797d
      Anish Bhatt authored
      Previous guard of IS_ENABLED(CONFIG_IPV6) is not sufficient when cxgbi drivers
      are built into kernel but ipv6 is not.
      
      v2: Use Kconfig to disable compiling cxgbi built into kernel when ipv6 is
      compiled as a module
      
      Fixes: e81fbf6c ("libcxgbi:cxgb4i Guard ipv6 code with a config check")
      Fixes: fc8d0590 ("libcxgbi: Add ipv6 api to driver")
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tg3: fix return value in tg3_get_stats64 · 7b31b4de
      Govindarajulu Varadarajan authored
      When tp->hw_stats is 0, tg3_get_stats64 should display previously
      recorded stats. So it returns &tp->net_stats_prev. But the caller,
      dev_get_stats, ignores the return value.
      
      Fix this by assigning tp->net_stats_prev to stats and returning stats.
      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Acked-by: Prashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
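The effect of a caller that ignores the return value can be shown with a reduced model. The types and function names below are hypothetical; only the shape of the bug (returning a pointer to saved stats instead of filling the caller's buffer) follows the commit message.

```c
#include <stddef.h>
#include <string.h>

struct toy_stats {
    unsigned long rx_packets;
    unsigned long tx_packets;
};

struct toy_dev {
    const struct toy_stats *hw_stats;  /* NULL while hardware stats are unavailable */
    struct toy_stats stats_prev;       /* previously recorded stats */
};

/* Buggy shape: returns another pointer and leaves 'out' untouched. */
static struct toy_stats *toy_get_stats_old(struct toy_dev *d,
                                           struct toy_stats *out)
{
    if (!d->hw_stats)
        return &d->stats_prev;
    *out = *d->hw_stats;
    return out;
}

/* Fixed shape: always fill the caller's buffer and return it. */
static struct toy_stats *toy_get_stats_fixed(struct toy_dev *d,
                                             struct toy_stats *out)
{
    *out = d->hw_stats ? *d->hw_stats : d->stats_prev;
    return out;
}

/* Caller modelled on the description: the return value is ignored. */
static struct toy_stats toy_caller(struct toy_dev *d,
        struct toy_stats *(*get)(struct toy_dev *, struct toy_stats *))
{
    struct toy_stats out;

    memset(&out, 0, sizeof(out));
    get(d, &out);
    return out;
}
```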
    • sunvnet: Schedule maybe_tx_wakeup() as a tasklet from ldc_rx path · 1d311ad2
      Sowmini Varadhan authored
      At the tail of vnet_event(), if we hit the maybe_tx_wakeup()
      condition, we try to take the netif_tx_lock() in the
      recv-interrupt-context and can deadlock with dev_watchdog().
      vnet_event() should schedule maybe_tx_wakeup() as a tasklet
      to avoid this deadlock.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sunvnet: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN · adddc32d
      Sowmini Varadhan authored
      ldc_rx -> vnet_rx -> .. -> vnet_walk_rx -> vnet_send_ack should not
      spin in an infinite loop waiting for EAGAIN to lift.
      
      The sender could have sent us a burst, and gone to lunch without
      doing any more ldc_read()'s. That should not cause the receiver to
      loop infinitely till soft-lockup kicks in.
      
      Similarly __vnet_tx_trigger should only loop on EAGAIN a finite
      number of times. The caller (vnet_start_xmit()) already has code
      to reset the dring state and bail on errors from __vnet_tx_trigger.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
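A bounded retry on EAGAIN can be sketched as below. The channel model, `toy_send()` and the bound `TOY_MAX_RETRIES` are hypothetical; the sketch also omits any delay between attempts.

```c
#include <errno.h>

#define TOY_MAX_RETRIES 10   /* illustrative bound */

/* Hypothetical channel that refuses the first 'busy_for' attempts. */
struct toy_chan {
    int busy_for;
    int attempts;
};

static int toy_send(struct toy_chan *c)
{
    c->attempts++;
    return (c->attempts <= c->busy_for) ? -EAGAIN : 0;
}

/* Retry on -EAGAIN, but give up after TOY_MAX_RETRIES attempts. */
static int toy_send_bounded(struct toy_chan *c)
{
    int err;
    int retries = 0;

    do {
        err = toy_send(c);
    } while (err == -EAGAIN && ++retries < TOY_MAX_RETRIES);

    return err;
}
```

When the bound is hit the error is returned to the caller, which can then reset its state and bail out instead of spinning.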
    • sunvnet: Do not ask for an ACK for every dring transmit · 1f6394e3
      Sowmini Varadhan authored
      No need to ask for an ACK with every vnet_start_xmit(): the single
      ACK with DRING_STOPPED is sufficient for the protocol, and we free
      the sk_buff in vnet_start_xmit itself, so we don't need an ACK back.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lec: Fix bug introduced by b67bfe0d · 8356f9d5
      chas williams - CONTRACTOR authored
      b67bfe0d ("hlist: drop the node parameter from iterators") dropped
      the node parameter from iterators, which lec_tbl_walk() was using to
      iterate the list.
      Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm/svc: Fix blocking in wait loop · de713b57
      chas williams - CONTRACTOR authored
      One should not call blocking primitives inside a wait loop, since both
      require task_struct::state to sleep, so the inner will destroy the
      outer state.
      
      sigd_enq() will possibly sleep for alloc_skb().  Move sigd_enq() before
      prepare_to_wait() to avoid sleeping while waiting interruptibly.  You do
      not actually need to call sigd_enq() after the initial prepare_to_wait()
      because we test the termination condition before calling schedule().
      
      Based on suggestions from Peter Zijlstra.
      Signed-off-by: Chas Williams <chas@cmf.n4rl.navy.mil>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • myri10ge: check for DMA mapping errors · 10545937
      Stanislaw Gruszka authored
      On IOMMU systems DMA mapping can fail, so we need to check for
      that possibility.
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Fix memory leak in ovs_vport_alloc() error path · 3791b3f6
      Christoph Jaeger authored
      ovs_vport_alloc() bails out without freeing the memory 'vport' points to.
      
      Picked up by Coverity - CID 1230503.
      
      Fixes: 5cd667b0 ("openvswitch: Allow each vport to have an array of 'port_id's.")
      Signed-off-by: Christoph Jaeger <cj@linux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
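The class of leak fixed here, an error path that returns without freeing an earlier allocation, can be sketched in userspace. Everything in this block is hypothetical: `toy_alloc()` takes a flag to inject a failure, and `toy_live` counts outstanding allocations so the leak is observable.

```c
#include <stdbool.h>
#include <stdlib.h>

static int toy_live;  /* outstanding allocations, for illustration */

static void *toy_alloc(size_t n, bool fail)
{
    void *p = fail ? NULL : calloc(1, n);

    if (p)
        toy_live++;
    return p;
}

static void toy_free(void *p)
{
    if (p) {
        toy_live--;
        free(p);
    }
}

struct toy_vport {
    unsigned *ids;
};

/* Two-step allocation; the second step can be made to fail. */
static struct toy_vport *toy_vport_alloc(bool fail_second)
{
    struct toy_vport *vport = toy_alloc(sizeof(*vport), false);

    if (!vport)
        return NULL;

    vport->ids = toy_alloc(sizeof(*vport->ids), fail_second);
    if (!vport->ids) {
        toy_free(vport);  /* the release a leaking error path omits */
        return NULL;
    }
    return vport;
}
```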
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f0094b28
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Several networking final fixes and tidies for the merge window:
      
         1) Changes during the merge window unintentionally took away the
            ability to build bluetooth modular, fix from Geert Uytterhoeven.
      
         2) Several phy_node reference count bug fixes from Uwe Kleine-König.
      
         3) Fix ucc_geth build failures, also from Uwe Kleine-König.
      
         4) Fix klog false positives when netlink messages go to network
            taps, by properly resetting the network header.  Fix from Daniel
            Borkmann.
      
         5) Sizing estimate of VF netlink messages is too small, from Jiri
            Benc.
      
         6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.
      
         7) VLAN untagging is erroneously dependent upon whether the VLAN
            module is loaded or not, but there are generic dependencies that
            matter wrt what can be expected as the SKB enters the stack.
            Make the basic untagging generic code, and do it unconditionally.
            From Vlad Yasevich.
      
         8) xen-netfront only has so many slots in its transmit queue so
            linearize packets that have too many frags.  From Zoltan Kiss.
      
         9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian
            Fainelli"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits)
        net: bcmgenet: correctly resume adapter from Wake-on-LAN
        net: bcmgenet: update UMAC_CMD only when link is detected
        net: bcmgenet: correctly suspend and resume PHY device
        net: bcmgenet: request and enable main clock earlier
        net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call
        xen-netfront: Fix handling packets on compound pages with skb_linearize
        net: fec: Support phys probed from devicetree and fixed-link
        smsc: replace WARN_ON() with WARN_ON_SMP()
        xen-netback: Don't deschedule NAPI when carrier off
        net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile
        wan: wanxl: Remove typedefs from struct names
        m68k/atari: EtherNEC - ethernet support (ne)
        net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call
        hdlc: Remove typedefs from struct names
        airo_cs: Remove typedef local_info_t
        atmel: Remove typedef atmel_priv_ioctl
        com20020_cs: Remove typedef com20020_dev_t
        ethernet: amd: Remove typedef local_info_t
        net: Always untag vlan-tagged traffic on input.
        drivers: net: Add APM X-Gene SoC ethernet driver support.
        ...
      f0094b28
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 13b102bf
      Linus Torvalds authored
      Pull Sparc fixes from David Miller:
       "Sparc bug fixes, one of which was preventing successful SMP boots with
        mainline"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: Fix pcr_ops initialization and usage bugs.
        sparc64: Do not disable interrupts in nmi_cpu_busy()
        sparc: Hook up seccomp and getrandom system calls.
        sparc: fix decimal printf format specifiers prefixed with 0x
      13b102bf
    • Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 81c02a21
      Linus Torvalds authored
      Pull x86/apic updates from Thomas Gleixner:
       "This is a major overhaul to the x86 apic subsystem consisting of the
        following parts:
      
         - Remove obsolete APIC driver abstractions (David Rientjes)
      
         - Use the irqdomain facilities to dynamically allocate IRQs for
           IOAPICs.  This is a prerequisite to enable IOAPIC hotplug support,
           and it also frees up wasted vectors (Jiang Liu)
      
         - Misc fixlets.
      
        Despite the hiccup in Ingo's previous pull request - caused by the
        missing fixup for the suspend/resume issue reported by Borislav - I
        strongly recommend that this update finds its way into 3.17.  Some
        history for you:
      
        This is preparatory work for physical IOAPIC hotplug.  The first
        attempt to support this was done by Yinghai and I shot it down because
        it just added another layer of obscurity and complexity to the already
        existing mess without tackling the underlying shortcomings of the
        current implementation.
      
        After quite some on- and offlist discussions, I requested that the
        design of this functionality must use generic infrastructure, i.e.
        irq domains, which provide all the mechanisms to dynamically map linux
        interrupt numbers to physical interrupts.
      
        Jiang picked up the idea and did a great job of consolidating the
        existing interfaces to manage the x86 (IOAPIC) interrupt system by
        utilizing irq domains.
      
        The testing in tip, Linux-next and inside of Intel on various machines
        did not unearth any oddities until Borislav exposed it to one of his
        oddball machines.  The issue was resolved quickly, but unfortunately
        the fix fell through the cracks and did not hit the tip tree before
        Ingo sent the pull request.  Not entirely Ingo's fault, I also assumed
        that the fix was already merged when Ingo asked me whether he could
        send it.
      
        Nevertheless this work has a proper design, has undergone several
        rounds of review and the final fallout after applying it to tip and
        integrating it into Linux-next has been more than moderate.  It's the
        ground work not only for IOAPIC hotplug, it will also allow us to move
        the lowlevel vector allocation into the irqdomain hierarchy, which
        will benefit other architectures as well.  Patches are posted already,
        but they are on hold for two weeks, see below.
      
        I really appreciate the competence and responsiveness Jiang has shown
        in the course of this endeavour.  So I'm sure that any fallout of this will
        be addressed in a timely manner.
      
        FYI, I'm vanishing for 2 weeks into my annual kids summer camp kitchen
        duty^Wvacation, while you folks are drooling at KS/LinuxCon :) But HPA
        will have a look at the hopefully zero fallout until I'm back"
      
      * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
        x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation
        x86/apic/vsmp: Make is_vsmp_box() static
        x86, apic: Remove enable_apic_mode callback
        x86, apic: Remove setup_portio_remap callback
        x86, apic: Remove multi_timer_check callback
        x86, apic: Replace noop_check_apicid_used
        x86, apic: Remove check_apicid_present callback
        x86, apic: Remove mps_oem_check callback
        x86, apic: Remove smp_callin_clear_local_apic callback
        x86, apic: Replace trampoline physical addresses with defaults
        x86, apic: Remove x86_32_numa_cpu_node callback
        x86: intel-mid: Use the new io_apic interfaces
        x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
        x86, irq: Clean up irqdomain transition code
        x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled
        x86, irq, SFI: Release IOAPIC pin when PCI device is disabled
        x86, irq, mpparse: Release IOAPIC pin when PCI device is disabled
        x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled
        x86, irq: Introduce helper functions to release IOAPIC pin
        x86, irq: Simplify the way to handle ISA IRQ
        ...
      81c02a21
    • Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d27c0d90
      Linus Torvalds authored
      Pull x86/efi fixes from Peter Anvin:
       "Two EFI-related Kconfig changes, which happen to touch immediately
        adjacent lines in Kconfig and thus collapse to a single patch"
      
      * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
        x86/efi: Fix 3DNow optimization build failure in EFI stub
      d27c0d90
    • Merge branch 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7453f33b
      Linus Torvalds authored
      Pull x86/xsave changes from Peter Anvin:
       "This is a patchset to support the XSAVES instruction required to
        support context switch of supervisor-only features in upcoming
        silicon.
      
        This patchset missed the 3.16 merge window, which is why it is based
        on 3.15-rc7"
      
      * 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, xsave: Add forgotten inline annotation
        x86/xsaves: Clean up code in xstate offsets computation in xsave area
        x86/xsave: Make it clear that the XSAVE macros use (%edi)/(%rdi)
        Define kernel API to get address of each state in xsave area
        x86/xsaves: Enable xsaves/xrstors
        x86/xsaves: Call booting time xsaves and xrstors in setup_init_fpu_buf
        x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time
        x86/xsaves: Add xsaves and xrstors support for booting time
        x86/xsaves: Clear reserved bits in xsave header
        x86/xsaves: Use xsave/xrstor for saving and restoring user space context
        x86/xsaves: Use xsaves/xrstors for context switch
        x86/xsaves: Use xsaves/xrstors to save and restore xsave area
        x86/xsaves: Define a macro for handling xsave/xrstor instruction fault
        x86/xsaves: Define macros for xsave instructions
        x86/xsaves: Change compacted format xsave area header
        x86/alternative: Add alternative_input_2 to support alternative with two features and input
        x86/xsaves: Add a kernel parameter noxsaves to disable xsaves/xrstors
      7453f33b
    • Merge tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag · fd1cf905
      Linus Torvalds authored
      Pull metag architecture updates from James Hogan:
       "Just a couple of minor static analysis fixes, removal of a NULL check
        that should never happen, and fix an error check where an unsigned
        value was being checked to see if it was negative"
      
      * tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
        metag: cachepart: Fix failure check
        metag: hugetlbpage: Remove null pointer checks that could never happen
      fd1cf905
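
      The class of bug mentioned in this pull request (an unsigned value
      tested for being negative) can be shown with a small stand-alone C
      sketch. It is an illustration under stated assumptions, not the metag
      code: query_size(), as_unsigned(), size_is_valid() and self_check()
      are hypothetical names, and -22 merely stands in for a negative error
      code.

```c
#include <assert.h>

/* Hypothetical helper: returns a size, or a negative error code on failure. */
static int query_size(int fail)
{
	return fail ? -22 : 4096;
}

/*
 * Storing the result in an unsigned variable turns -22 into a large
 * positive number, so a later "value < 0" test can never be true.
 */
static unsigned int as_unsigned(int fail)
{
	return (unsigned int)query_size(fail);
}

/* Fixed check: keep the result signed until the error test is done. */
static int size_is_valid(int fail)
{
	int size = query_size(fail);

	return size >= 0;
}

/* Exercises both paths of the sketch. */
static void self_check(void)
{
	assert(as_unsigned(1) > 0u);   /* the error code no longer looks negative */
	assert(!size_is_valid(1));     /* the signed check catches the failure */
	assert(size_is_valid(0));      /* a normal size passes */
}
```

      Keeping the value in a signed type until after the failure test is
      one way to make such a check effective; comparing against an explicit
      error sentinel would be another.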
    • Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 06b8ab55
      Linus Torvalds authored
      Pull NFS client updates from Trond Myklebust:
       "Highlights include:
      
         - stable fix for a bug in nfs3_list_one_acl()
         - speed up NFS path walks by supporting LOOKUP_RCU
         - more read/write code cleanups
         - pNFS fixes for layout return on close
         - fixes for the RCU handling in the rpcsec_gss code
         - more NFS/RDMA fixes"
      
      * tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
        nfs: reject changes to resvport and sharecache during remount
        NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
        SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
        NFS: fix two problems in lookup_revalidate in RCU-walk
        NFS: allow lockless access to access_cache
        NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
        NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
        NFS: support RCU_WALK in nfs_permission()
        sunrpc/auth: allow lockless (rcu) lookup of credential cache.
        NFS: prepare for RCU-walk support but pushing tests later in code.
        NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
        NFS: add checks for returned value of try_module_get()
        nfs: clear_request_commit while holding i_lock
        pnfs: add pnfs_put_lseg_async
        pnfs: find swapped pages on pnfs commit lists too
        nfs: fix comment and add warn_on for PG_INODE_REF
        nfs: check wait_on_bit_lock err in page_group_lock
        sunrpc: remove "ec" argument from encrypt_v2 operation
        sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
        sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
        ...
      06b8ab55
    • Merge tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs · dc1cc851
      Linus Torvalds authored
      Pull xfs update from Dave Chinner:
       "This update contains:
         - conversion of the XFS core to pass negative error numbers
         - restructuring of core XFS code that is shared with userspace to
           fs/xfs/libxfs
         - introduction of sysfs interface for XFS
         - bulkstat refactoring
         - demand driven speculative preallocation removal
         - XFS now always requires 64 bit sectors to be configured
         - metadata verifier changes to ensure CRCs are calculated during log
           recovery
         - various minor code cleanups
         - miscellaneous bug fixes
      
        The diffstat is kind of noisy because of the restructuring of the code
        to make kernel/userspace code sharing simpler, along with the XFS wide
        change to use the standard negative error return convention (at last!)"
      
      * tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs: (45 commits)
        xfs: fix coccinelle warnings
        xfs: flush both inodes in xfs_swap_extents
        xfs: fix swapext ilock deadlock
        xfs: kill xfs_vnode.h
        xfs: kill VN_MAPPED
        xfs: kill VN_CACHED
        xfs: kill VN_DIRTY()
        xfs: dquot recovery needs verifiers
        xfs: quotacheck leaves dquot buffers without verifiers
        xfs: ensure verifiers are attached to recovered buffers
        xfs: catch buffers written without verifiers attached
        xfs: avoid false quotacheck after unclean shutdown
        xfs: fix rounding error of fiemap length parameter
        xfs: introduce xfs_bulkstat_ag_ichunk
        xfs: require 64-bit sector_t
        xfs: fix uflags detection at xfs_fs_rm_xquota
        xfs: remove XFS_IS_OQUOTA_ON macros
        xfs: tidy up xfs_set_inode32
        xfs: allow inode allocations in post-growfs disk space
        xfs: mark xfs_qm_quotacheck as static
        ...
      dc1cc851