1. 25 Apr, 2016 11 commits
  2. 24 Apr, 2016 9 commits
    • Sridhar Samudrala's avatar
      ixgbe: make 'action' field in struct ixgbe_fdir_filter a u64 value · 2a9ed5d1
      Sridhar Samudrala authored
      This field is used to record the RX queue index for a redirect action
      passed via ring_cookie field in struct ethtool_rx_flow_spec which is
      a u64 value.
      
      For ex: after adding a filter rule to redirect to a VF using ethtool
        # echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
        # ethtool -N p4p1 flow-type ip4 src-ip 192.168.0.1 action 0x100000000
      
      querying for the rule shows the Action as 'Direct to queue 0'
      
        # ethtool -n p4p1
        4 RX rings available
        Total 1 rules
      
        Filter: 2045
       	Rule Type: Raw IPv4
      	Src IP addr: 192.168.0.1 mask: 0.0.0.0
      	Dest IP addr: 0.0.0.0 mask: 255.255.255.255
      	TOS: 0x0 mask: 0xff
      	Protocol: 0 mask: 0xff
      	L4 bytes: 0x0 mask: 0xffffffff
      	VLAN EtherType: 0x0 mask: 0xffff
      	VLAN: 0x0 mask: 0xffff
      	User-defined: 0x0 mask: 0xffffffffffffffff
      	Action: Direct to queue 0
      
      With this fix, ethtool will report the right queue index even for VFs.
      	Action: Direct to queue 4294967296
      
      Here 4294967296 corresponds to 0x100000000.
      We need to update 'ethtool' to report the queue index as a Hex value so
      that it is more  user friendly and matches with the 'action' value that
      is passed when adding the rule.
      Signed-off-by: default avatarSridhar Samudrala <sridhar.samudrala@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      2a9ed5d1
    • Emil Tantilov's avatar
      ixgbe: fix default mac->ops.setup_link for X550EM · 4695886c
      Emil Tantilov authored
      X550EM_a/x did not have a default value for mac->ops.setup_link which
      was causing link issues for backplane devices.
      
      This patch sets mac->ops.setup_link to ixgbe_setup_mac_link_X540 for
      X550EM_a/x which is also default for X550. This will result in
      mac->ops.setup_link calling the link setup function for the respective
      PHY type in case we do not need a special function to deal with it.
      Reported-by: default avatarKen Cox <jkc@redhat.com>
      Signed-off-by: default avatarEmil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4695886c
    • Emil Tantilov's avatar
      ixgbe: set VLAN spoof checking unconditionally · d3dec7c7
      Emil Tantilov authored
      Previously the PF driver would only set VLAN spoof checking if
      the VF had created VLANs. This was done by setting and checking
      a counter (vlan_count) whenever a VLAN was created by the VF.
      However it is possible for the vlan_count to be !=0 while there are
      no VLANs assigned to the VF due to the count incrementing every
      time a VLAN 0 is added on ifdown/up, which resulted in VLAN spoofing
      always being set for those VFs.
      
      This patch cleans up the logic by unconditionally setting VLAN based on
      how the VF is configured (via ip link set ethX vf Y spoofchk on/off).
      This change also resolves an issue where the VLAN spoofing can remain
      set even after being disabled by the user due to the driver enabling
      VLAN spoof checking every time a VLAN is added to the VF, but would
      only allow changes in the setting if vlan_count != 0.
      
      Also default_vf_vlan_id and vlans_enabled were removed from the
      vf_data_storage structure since they are not being used in the driver.
      Signed-off-by: default avatarEmil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      d3dec7c7
    • Emil Tantilov's avatar
      ixgbe: consolidate the configuration of spoof checking · 77f192af
      Emil Tantilov authored
      Consolidate the logic behind configuring spoof checking:
      
      Move the setting of the MAC, VLAN and Ethertype spoof checking into
      ixgbe_ndo_set_vf_spoofchk().
      
      Change ixgbe_set_mac_anti_spoofing() to set MAC spoofing per VF similar
      to the VLAN and Ethertype functions - this allows us to call the helper
      functions in ixgbe_ndo_set_vf_spoofchk() for all spoof check types and
      only disable MAC spoof checking when creating MACVLAN.
      Signed-off-by: default avatarEmil Tantilov <emil.s.tantilov@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      77f192af
    • Eric Dumazet's avatar
      tcp-tso: do not split TSO packets at retransmit time · 10d3be56
      Eric Dumazet authored
      Linux TCP stack painfully segments all TSO/GSO packets before retransmits.
      
      This was fine back in the days when TSO/GSO were emerging, with their
      bugs, but we believe the dark age is over.
      
      Keeping big packets in write queues, but also in stack traversal
      has a lot of benefits.
       - Less memory overhead, because write queues have less skbs
       - Less cpu overhead at ACK processing.
       - Better SACK processing, as lot of studies mentioned how
         awful linux was at this ;)
       - Less cpu overhead to send the rtx packets
         (IP stack traversal, netfilter traversal, drivers...)
       - Better latencies in presence of losses.
       - Smaller spikes in fq like packet schedulers, as retransmits
         are not constrained by TCP Small Queues.
      
      1 % packet losses are common today, and at 100Gbit speeds, this
      translates to ~80,000 losses per second.
      Losses are often correlated, and we see many retransmit events
      leading to 1-MSS train of packets, at the time hosts are already
      under stress.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10d3be56
    • Parthasarathy Bhuvaragan's avatar
      tipc: fix stale links after re-enabling bearer · 8cee83dd
      Parthasarathy Bhuvaragan authored
      Commit 42b18f60 ("tipc: refactor function tipc_link_timeout()"),
      introduced a bug which prevents sending of probe messages during
      link synchronization phase. This leads to hanging links, if the
      bearer is disabled/enabled after links are up.
      
      In this commit, we send the probe messages correctly.
      
      Fixes: 42b18f60 ("tipc: refactor function tipc_link_timeout()")
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8cee83dd
    • David S. Miller's avatar
      Merge branch 'tcp-tcstamp_ack-frag-coalesce' · 6a74c196
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      tcp: Handle txstamp_ack when fragmenting/coalescing skbs
      
      This patchset is to handle the txstamp-ack bit when
      fragmenting/coalescing skbs.
      
      The second patch depends on the recently posted series
      for the net branch:
      "tcp: Merge timestamp info when coalescing skbs"
      
      A BPF prog is used to kprobe to sock_queue_err_skb()
      and print out the value of serr->ee.ee_data.  The BPF
      prog (run-able from bcc) is attached here:
      
      BPF prog used for testing:
      ~~~~~
      
      from __future__ import print_function
      from bcc import BPF
      
      bpf_text = """
      
      int trace_err_skb(struct pt_regs *ctx)
      {
      	struct sk_buff *skb = (struct sk_buff *)ctx->si;
      	struct sock *sk = (struct sock *)ctx->di;
      	struct sock_exterr_skb *serr;
      	u32 ee_data = 0;
      
      	if (!sk || !skb)
      		return 0;
      
      	serr = SKB_EXT_ERR(skb);
      	bpf_probe_read(&ee_data, sizeof(ee_data), &serr->ee.ee_data);
      	bpf_trace_printk("ee_data:%u\\n", ee_data);
      
      	return 0;
      };
      """
      
      b = BPF(text=bpf_text)
      b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
      print("Attached to kprobe")
      b.trace_print()
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6a74c196
    • Martin KaFai Lau's avatar
      tcp: Merge txstamp_ack in tcp_skb_collapse_tstamp · 2de8023e
      Martin KaFai Lau authored
      When collapsing skbs, txstamp_ack also needs to be merged.
      
      Retrans Collapse Test:
      ~~~~~~
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      0.200 write(4, ..., 730) = 730
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 730) = 730
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      0.200 write(4, ..., 11680) = 11680
      
      0.200 > P. 1:731(730) ack 1
      0.200 > P. 731:1461(730) ack 1
      0.200 > . 1461:8761(7300) ack 1
      0.200 > P. 8761:13141(4380) ack 1
      
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:2921,nop,nop>
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:4381,nop,nop>
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:5841,nop,nop>
      0.300 > P. 1:1461(1460) ack 1
      0.400 < . 1:1(0) ack 13141 win 257
      
      BPF Output Before:
      ~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~
      <...>-2027  [007] d.s.    79.765921: : ee_data:1459
      
      Sacks Collapse Test:
      ~~~~~
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      0.200 write(4, ..., 1460) = 1460
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 13140) = 13140
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      
      0.200 > P. 1:1461(1460) ack 1
      0.200 > . 1461:8761(7300) ack 1
      0.200 > P. 8761:14601(5840) ack 1
      
      0.300 < . 1:1(0) ack 1 win 257 <sack 1461:14601,nop,nop>
      0.300 > P. 1:1461(1460) ack 1
      0.400 < . 1:1(0) ack 14601 win 257
      
      BPF Output Before:
      ~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~
      <...>-2049  [007] d.s.    89.185538: : ee_data:14599
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Tested-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2de8023e
    • Martin KaFai Lau's avatar
      tcp: Carry txstamp_ack in tcp_fragment_tstamp · b51e13fa
      Martin KaFai Lau authored
      When a tcp skb is sliced into two smaller skbs (e.g. in
      tcp_fragment() and tso_fragment()),  it does not carry
      the txstamp_ack bit to the newly created skb if it is needed.
      The end result is a timestamping event (SCM_TSTAMP_ACK) will
      be missing from the sk->sk_error_queue.
      
      This patch carries this bit to the new skb2
      in tcp_fragment_tstamp().
      
      BPF Output Before:
      ~~~~~~
      <No output due to missing SCM_TSTAMP_ACK timestamp>
      
      BPF Output After:
      ~~~~~~
      <...>-2050  [000] d.s.   100.928763: : ee_data:14599
      
      Packetdrill Script:
      ~~~~~~
      +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
      +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
      +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
      
      +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
      0.200 write(4, ..., 14600) = 14600
      +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
      
      0.200 > . 1:7301(7300) ack 1
      0.200 > P. 7301:14601(7300) ack 1
      
      0.300 < . 1:1(0) ack 14601 win 257
      
      0.300 close(4) = 0
      0.300 > F. 14601:14601(0) ack 1
      0.400 < F. 1:1(0) ack 16062 win 257
      0.400 > . 14602:14602(0) ack 2
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Tested-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b51e13fa
  3. 23 Apr, 2016 12 commits
  4. 21 Apr, 2016 8 commits
    • Linus Torvalds's avatar
      Merge tag 'rtc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 5f44abd0
      Linus Torvalds authored
      Pull RTC fixes from Alexandre Belloni:
       "A few fixes for the RTC subsystem.  The documentation fix already
        missed 4.5 so I think it is worth taking it now:
      
        A documentation fix for s3c and two fixes for the ds1307"
      
      * tag 'rtc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        rtc: ds1307: Use irq when available for wakeup-source device
        rtc: ds1307: ds3231 temperature s16 overflow
        rtc: s3c: Document in binding that only s3c6410 needs a src clk
      5f44abd0
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f78fe081
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "Two fixes for issues introduced recently, one for an intel_pstate
        driver problem uncovered by the recent switch over from using timers
        and the other one for a potential cpufreq core problem related to
        system suspend/resume.
      
        Specifics:
      
         - Fix an intel_pstate driver problem causing CPUs to get stuck in the
           highest P-state when completely idle uncovered by the recent switch
           over from using timers (Rafael Wysocki).
      
         - Avoid attempts to get the current CPU frequency when all devices
           (like I2C controllers that may be nedded for that purpose) have
           been suspended during system suspend/resume (Rafael Wysocki)"
      
      * tag 'pm+acpi-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set
        intel_pstate: Avoid getting stuck in high P-states when idle
      f78fe081
    • Nishanth Menon's avatar
      rtc: ds1307: Use irq when available for wakeup-source device · 38a7a73e
      Nishanth Menon authored
      With commit 8bc2a407 ("rtc: ds1307: add support for the
      DT property 'wakeup-source'") we lost the ability for rtc irq
      functionality for devices that are actually hooked on a real IRQ
      line and have capability to wakeup as well. This is not an expected
      behavior. So, instead of just not requesting IRQ, skip the IRQ
      requirement only if interrupts are not defined for the device.
      
      Fixes: 8bc2a407 ("rtc: ds1307: add support for the DT property 'wakeup-source'")
      Reported-by: default avatarTony Lindgren <tony@atomide.com>
      Cc: Michael Lange <linuxstuff@milaw.biz>
      Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: default avatarNishanth Menon <nm@ti.com>
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      38a7a73e
    • Zhuang Yuyao's avatar
      rtc: ds1307: ds3231 temperature s16 overflow · 9a3dce62
      Zhuang Yuyao authored
      while retrieving temperature from ds3231, the result may be overflow
      since s16 is too small for a multiplication with 250.
      
      ie. if temp_buf[0] == 0x2d, the result (s16 temp) will be negative.
      Signed-off-by: default avatarAkinobu Mita <akinobu.mita@gmail.com>
      Tested-by: default avatarMichael Tatarinov <kukabu@gmail.com>
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      9a3dce62
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c5edde3a
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix memory leak in iwlwifi, from Matti Gottlieb.
      
       2) Add missing registration of netfilter arp_tables into initial
          namespace, from Florian Westphal.
      
       3) Fix potential NULL deref in DecNET routing code.
      
       4) Restrict NETLINK_URELEASE to truly bound sockets only, from Dmitry
          Ivanov.
      
       5) Fix dst ref counting in VRF, from David Ahern.
      
       6) Fix TSO segmenting limits in i40e driver, from Alexander Duyck.
      
       7) Fix heap leak in PACKET_DIAG_MCLIST, from Mathias Krause.
      
       8) Ravalidate IPV6 datagram socket cached routes properly, particularly
          with UDP, from Martin KaFai Lau.
      
       9) Fix endian bug in RDS dp_ack_seq handling, from Qing Huang.
      
      10) Fix stats typing in bcmgenet driver, from Eric Dumazet.
      
      11) Openvswitch needs to orphan SKBs before ipv6 fragmentation handing,
          from Joe Stringer.
      
      12) SPI device reference leak in spi_ks8895 PHY driver, from Mark Brown.
      
      13) atl2 doesn't actually support scatter-gather, so don't advertise the
          feature.  From Ben Hucthings.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
        openvswitch: use flow protocol when recalculating ipv6 checksums
        Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets
        atl2: Disable unimplemented scatter/gather feature
        net/mlx4_en: Split SW RX dropped counter per RX ring
        net/mlx4_core: Don't allow to VF change global pause settings
        net/mlx4_core: Avoid repeated calls to pci enable/disable
        net/mlx4_core: Implement pci_resume callback
        net: phy: spi_ks8895: Don't leak references to SPI devices
        net: ethernet: davinci_emac: Fix platform_data overwrite
        net: ethernet: davinci_emac: Fix Unbalanced pm_runtime_enable
        qede: Fix single MTU sized packet from firmware GRO flow
        qede: Fix setting Skb network header
        qede: Fix various memory allocation error flows for fastpath
        tcp: Merge tx_flags and tskey in tcp_shifted_skb
        tcp: Merge tx_flags and tskey in tcp_collapse_retrans
        drivers: net: cpsw: fix wrong regs access in cpsw_ndo_open
        tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks
        openvswitch: Orphan skbs before IPv6 defrag
        Revert "Prevent NUll pointer dereference with two PHYs on cpsw"
        VSOCK: Only check error on skb_recv_datagram when skb is NULL
        ...
      c5edde3a
    • David S. Miller's avatar
      Merge branch 'geneve-vxlan-deps' · 22d37b6b
      David S. Miller authored
      Hannes Frederic Sowa says:
      
      ====================
      net: network drivers should not depend on geneve/vxlan
      
      This patchset removes the dependency of network drivers on vxlan or
      geneve, so those don't get autoloaded when the nic driver is loaded.
      
      Also audited the code such that vxlan_get_rx_port and geneve_get_rx_port
      are not called without rtnl lock.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22d37b6b
    • Hannes Frederic Sowa's avatar
      geneve: break dependency with netdev drivers · 681e683f
      Hannes Frederic Sowa authored
      Equivalent to "vxlan: break dependency with netdev drivers", don't
      autoload geneve module in case the driver is loaded. Instead make the
      coupling weaker by using netdevice notifiers as proxy.
      
      Cc: Jesse Gross <jesse@kernel.org>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      681e683f
    • Hannes Frederic Sowa's avatar
      vxlan: break dependency with netdev drivers · b7aade15
      Hannes Frederic Sowa authored
      Currently all drivers depend and autoload the vxlan module because how
      vxlan_get_rx_port is linked into them. Remove this dependency:
      
      By using a new event type in the netdevice notifier call chain we proxy
      the request from the drivers to flush and resetup the vxlan ports not
      directly via function call but by the already existing netdevice
      notifier call chain.
      
      I added a separate new event type, NETDEV_OFFLOAD_PUSH_VXLAN, to do so.
      We don't need to save those ids, as the event type field is an unsigned
      long and using specialized event types for this purpose seemed to be a
      more elegant way. This also comes in beneficial if in future we want to
      add offloading knobs for vxlan.
      
      Cc: Jesse Gross <jesse@kernel.org>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b7aade15