1. 26 Jan, 2015 14 commits
  2. 25 Jan, 2015 26 commits
    • Merge branch 'phy_dsa' · 5c66cfe0
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: phy and dsa random fixes/cleanups
      
      These two patches were already present as part of my attempt to make
      DSA modules work properly; they are the only two "valid" patches at
      this point which should not need any further rework.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: factor interrupt disabling in a function · 691c9a8f
      Florian Fainelli authored
      Factor the interrupt disabling in a function: bcm_sf2_intr_disable()
      since we are doing the same thing in the setup and suspend paths.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fixed: allow setting no update_link callback · 799d4444
      Florian Fainelli authored
      fixed_phy_set_link_update() contains an early check against a NULL
      callback pointer, which basically prevents us from removing any
      previous callback we may have set. The users of the fp->link_update
      callback deal with a NULL callback just fine, so we really want to allow
      "removing" a link_update callback to avoid dangling callback pointers
      during e.g: module removal.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
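The behaviour described in 799d4444 can be illustrated with a small userspace sketch. This is not the kernel code: `fake_fixed_phy`, `set_link_update`, `poll_link` and `force_link_up` are hypothetical stand-ins, chosen only to show a setter that accepts NULL and a caller that skips a NULL callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fixed PHY state; not the kernel struct. */
struct fake_fixed_phy {
    int link;
    int (*link_update)(struct fake_fixed_phy *fp);
};

/* Accept a NULL callback so that a module can clear it on removal. */
int set_link_update(struct fake_fixed_phy *fp,
                    int (*cb)(struct fake_fixed_phy *))
{
    if (!fp)
        return -1;          /* only the PHY pointer is validated */
    fp->link_update = cb;   /* cb may be NULL: this removes the callback */
    return 0;
}

/* Callers tolerate a NULL callback by simply not calling it. */
int poll_link(struct fake_fixed_phy *fp)
{
    if (fp->link_update)
        fp->link = fp->link_update(fp);
    return fp->link;
}

/* Example callback that always reports link up. */
int force_link_up(struct fake_fixed_phy *fp)
{
    (void)fp;
    return 1;
}
```

With a setter of this shape, passing NULL on module removal leaves no pointer into unloaded code behind.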
    • net: ipv6: Add sysctl entry to disable MTU updates from RA · c2943f14
      Harout Hedeshian authored
      The kernel forcefully applies MTU values received in router
      advertisements provided the new MTU is less than the current. This
      behavior is undesirable when the user space is managing the MTU. Instead
      a sysctl flag 'accept_ra_mtu' is introduced such that the user space
      can control whether or not RA provided MTU updates should be applied. The
      default behavior is unchanged; user space must explicitly set this flag
      to 0 for RA MTUs to be ignored.
      Signed-off-by: Harout Hedeshian <harouth@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
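The decision described in c2943f14 can be sketched as a pure function. This is an illustration under assumptions: `mtu_after_ra` is a hypothetical helper, not a kernel function, and any further validation the kernel applies to the advertised MTU is left out.

```c
#include <assert.h>

/*
 * Hypothetical helper: return the MTU in effect after a router
 * advertisement carrying ra_mtu has been seen.
 */
unsigned int mtu_after_ra(unsigned int cur_mtu, unsigned int ra_mtu,
                          int accept_ra_mtu)
{
    if (!accept_ra_mtu)
        return cur_mtu;     /* flag set to 0: user space manages the MTU */
    if (ra_mtu < cur_mtu)
        return ra_mtu;      /* as described: only a smaller MTU is applied */
    return cur_mtu;
}
```

Because the flag is non-zero unless user space clears it, the default path behaves as it did before the patch.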
    • Merge branch 'fib_trie_next' · 46a93af2
      David S. Miller authored
      Alexander Duyck says:
      
      ====================
      Fixes and improvements for recent fib_trie updates
      
      While performing testing and prepping the next round of patches I found a
      few minor issues and improvements that could be made.
      
      These changes should help to reduce the overall code size and improve the
      performance slightly, as I noticed a 20ns or so improvement in my worst-case
      testing, which will likely only result in a 1ns difference with a standard
      sized trie.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Various clean-ups for handling slen · 64c62723
      Alexander Duyck authored
      While doing further work on the fib_trie I noted a few items.
      
      First I was using calls that were far more complicated than they needed to
      be for determining when to push/pull the suffix length.  I have updated the
      code to reflect the simpler logic.
      
      The second issue is that I realised we weren't necessarily handling the
      case of a leaf_info struct surviving a flush.  I have updated the logic so
      that now we will call pull_suffix in the event of having a leaf info value
      left in the leaf after flushing it.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Move fib_find_alias to file where it is used · 02525368
      Alexander Duyck authored
      The function fib_find_alias is only accessed by functions in fib_trie.c, so
      it makes sense to relocate it there and declare it static so that the
      compiler can take advantage of the optimizations available for a local
      function.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Use empty_children instead of counting empty nodes in stats collection · 30cfe7c9
      Alexander Duyck authored
      It doesn't make much sense to count the pointers ourselves when
      empty_children already has a count for the number of NULL pointers stored
      in the tnode.  As such save ourselves the cycles and just use
      empty_children.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Add collapse() and should_collapse() to resize · 95f60ea3
      Alexander Duyck authored
      This patch really does two things.
      
      First it pulls the logic for determining if we should collapse one node out
      of the tree and the actual code doing the collapse into a separate pair of
      functions.  This helps to make the changes to these areas more readable.
      
      Second it encodes the upper 32b of the empty_children value onto the
      full_children value in the case of bits == KEYLENGTH.  By doing this we are
      able to handle the case of a 32b node where empty_children would appear to
      be 0 when it was actually 1ul << 32.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
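The overflow that 95f60ea3 works around can be reproduced in userspace. The sketch below only shows why a 32-bit counter reads 0 for a node with bits == KEYLENGTH; the helper names `child_slots` and `empty_count_32` are hypothetical, and the kernel's actual encoding into full_children is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define KEYLENGTH 32

/* Number of child slots in a node covering 'bits' bits, held in 64 bits. */
uint64_t child_slots(unsigned int bits)
{
    return (uint64_t)1 << bits;
}

/* What a 32-bit counter would hold for a completely empty node. */
uint32_t empty_count_32(unsigned int bits)
{
    return (uint32_t)child_slots(bits);  /* keeps only the low 32 bits */
}
```

For bits == KEYLENGTH the true count is 1 << 32, whose low 32 bits are all zero, so the missing upper bit has to be recorded somewhere else.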
    • fib_trie: Fall back to slen update on inflate/halve failure · a80e89d4
      Alexander Duyck authored
      This change corrects an issue where if inflate or halve fails we were
      exiting the resize function without at least updating the slen for the
      node.  To correct this I have moved the update of max_size into the while
      loop so that it is only decremented on a successful call to either inflate
      or halve.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Fix RCU bug and merge similar bits of inflate/halve · 69fa57b1
      Alexander Duyck authored
      This patch addresses two issues.
      
      The first issue is the fact that I believe I had the RCU freeing sequence
      slightly out of order.  As a result we could get into an issue if a caller
      went into a child of a child of the new node, then backtracked into the
      to-be-freed parent, and then attempted to access a child of a child that may
      have been consumed in a resize of one of the new node's children.  To resolve
      this I have moved the resize after we have freed the oldtnode.  The only side
      effect of this is that we will now be calling resize on more nodes in the case
      of inflate due to the fact that we don't have a good way to test to see if a
      full_tnode on the new node was there before or after the allocation.  This
      should have minimal impact however since the node should already be
      correctly sized, so it is just the cost of calling should_inflate that we
      will be taking on the node, which is only a couple of cycles.
      
      The second issue is the fact that inflate and halve were essentially doing
      the same thing after the new node was added to the trie replacing the old
      one.  As such it wasn't really necessary to keep the code in both functions
      so I have split it out into two other functions, called replace and
      update_children.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Use index & (~0ul << n->bits) instead of index >> n->bits · b3832117
      Alexander Duyck authored
      In doing performance testing and analysis of the changes I recently found
      that by shifting the index I had created an unnecessary dependency.
      
      I have updated the code so that we instead shift a mask by bits and then
      just test against that as that should save us about 2 CPU cycles since we
      can generate the mask while the key and pos are being processed.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
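The two tests named in the title of b3832117 give the same answer; the difference is only in what each one depends on. A minimal sketch, assuming hypothetical helper names `mismatch_shift` and `mismatch_mask` and plain `unsigned long` values rather than the trie's own types:

```c
#include <assert.h>
#include <limits.h>

#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Original form: shift the index, then test for leftover high bits. */
int mismatch_shift(unsigned long index, unsigned int bits)
{
    assert(bits < ULONG_BITS);  /* shifting by the full width is undefined */
    return (index >> bits) != 0;
}

/* Revised form: build the mask from bits alone, then AND with the index. */
int mismatch_mask(unsigned long index, unsigned int bits)
{
    assert(bits < ULONG_BITS);
    return (index & (~0ul << bits)) != 0;
}
```

In the second form the mask depends only on `bits`, so it can be computed before `index` is available, which is the dependency the commit message describes removing.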
    • Merge branch 'mlx4-next' · bc579ae5
      David S. Miller authored
      Or Gerlitz says:
      
      ====================
      mlx4: Fix and enhance the device reset flow
      
      This series from Yishai Hadas fixes the device reset flow and adds SRIOV support.
      
      Reset flows are required whenever a device experiences errors, is unresponsive,
      or is not in a deterministic state. In such cases, the driver is expected to
      reset the HW and continue operation. When SRIOV is enabled, these requirements
      apply both to PF and VF devices.
      
      Currently, the mlx4 reset flow doesn't work properly: when a fatal error is
      detected on the FW internal buffer the chip is not reset and stays in its
      bad state. There are cases that are assumed to be fatal, such as non-responsive
      FW and errors on closing commands, but they are not handled today.
      
      The AER mechanism should also be fixed:
      - It should use mlx4_load_one instead of __mlx4_init_one which is done
        upon HCA probing.
      - It must be aligned with concurrent catas flow, mark device to be in
        an error state, reset chip, etc.
      - Port types should be restored to their original values before error occurred.
      
      In addition, the SRIOV use case isn't supported.
      
      In the above cases, when the device state becomes fatal we must act as follows:
      1) Reset the chip and mark the HW device state as in fatal error.
      2) Wake up any pending commands and prevent new ones from coming in.
      3) Restart the software stack.
      
      We also address the SRIOV mode as follows: in case the PF detects a fatal error,
      it lets the VFs know about it, then both it and the VFs are restarted asynchronously.
      However, in case only the VF encounters a fatal error or is forced to reset, it
      resets its own state and then restarts its software stack.
      
      changes from V0:
      
      No need to call pci_disable_device upon permanent PCI error. This will
      be done as part of mlx4_remove_one which is called later once we
      return PCI_ERS_RESULT_DISCONNECT from the pci error handler.
      
      Initial toggle value should use only the T bit and not the whole byte value.
      Not doing so sometimes broke SRIOV because the VF interpreted the junk value
      as a non-ready comm channel.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Reset flow activation upon SRIOV fatal command cases · 0cd93027
      Yishai Hadas authored
      When SRIOV commands are executed over the comm-channel and get
      a fatal error (e.g. timeout, closing command failure) the VF enters
      into error state and reset flow is activated.
      
      To be able to recognize whether the failure was on a closing command, the
      operational code for the given VHCR command is used. Once the device entered
      into an error state we prevent redundant error messages from being printed.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Enable device recovery flow with SRIOV · 55ad3592
      Yishai Hadas authored
      In SRIOV, both the PF and the VF may attempt device recovery whenever they
      assume that the device is not functioning.  When the PF driver resets the
      device, the VF should detect this and attempt to reinitialize itself.
      
      The VF must be able to reset itself under all circumstances, even
      if the PF is not responsive.
      
      The VF shall reset itself in the following cases:
      
      1. Commands are not processed within reasonable time over the communication channel.
      This is done considering device state and the correct return code based on
      the command as was done in the native mode, done in the next patch.
      
      2. The VF driver receives an internal error event reported by the PF on the
      communication channel. This occurs when the PF driver resets the device or
      when VF is out of sync with the PF.
      
      Add 'VF reset' capability, which allows the VF to reinitialize itself even when the
      PF is not responsive.
      
      As PF and VF may run their reset flow simultaneously, there are several cases
      that are handled:
      - Prevent freeing VF resources upon FLR, when PF is in its unloading stage.
      - Prevent PF getting VF commands before it has finished initializing its resources.
      - Upon VF startup, check that comm-channel is online before sending
        commands to the PF and getting timed-out.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Handle AER flow properly · 2ba5fbd6
      Yishai Hadas authored
      Fix AER callbacks to work properly. This includes:
      - Refactoring AER to be aligned with Reset flow support.
      - Sync with concurrent catas flow.
      
      In addition, fix the shutdown PCI callback to sync with
      concurrent catas flow.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Manage interface state for Reset flow cases · c69453e2
      Yishai Hadas authored
      We need to manage interface state to sync between reset flow and some other
      related cases such as remove_one. This has to be done to prevent certain
      races. For example, in case the software stack is down as a result of an unload
      call, remove_one should skip the unload phase.
      
      Implement the remove_one case, handling AER and other cases comes next.
      
      The interface can be up/down, upon remove_one, the state will include an extra
      bit indicating that the device is cleaned-up, forcing other tasks to finish
      before the final cleanup.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Activate reset flow upon fatal command cases · f5aef5aa
      Yishai Hadas authored
      We activate reset flow upon command fatal errors, when the device enters an
      erroneous state, and must be reset.
      
      The cases below are assumed to be fatal: FW command timed-out, an error from FW
      on closing commands, pci is offline when posting/pending a command.
      
      In those cases we place the device into an error state: chip is reset, pending
      commands are awakened and completed immediately. Subsequent commands will
      return immediately.
      
      The return code in the above cases will depend on the command. Commands which
      free and close resources will return success (because the chip was reset, so
      callers may safely free their kernel resources). Other commands will return -EIO.
      
      Since the device's state was marked as error, the catas poller will
      detect this and restart the device's software stack (as is done when a FW
      internal error is directly detected). The device state is protected by a
      persistent mutex that lives on its mlx4_dev, so the hcr_mutex is no longer
      needed and is removed.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
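The return-code rule described in f5aef5aa can be sketched as follows. The `cmd_kind` classification and the `status_in_error_state` helper are hypothetical, introduced only for illustration; the driver's own way of recognising closing commands is not shown.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a command; not the driver's own types. */
enum cmd_kind {
    CMD_CLOSING,    /* frees or closes a resource */
    CMD_OTHER       /* anything else */
};

/* Status returned for a command issued while the device is in error state. */
int status_in_error_state(enum cmd_kind kind)
{
    /*
     * The chip has already been reset, so a caller that only wants to
     * release a resource may safely free its kernel-side state.
     */
    if (kind == CMD_CLOSING)
        return 0;
    return -EIO;
}
```

Reporting success for closing commands lets teardown paths run to completion instead of leaking resources after a reset.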
    • net/mlx4_core: Enhance the catas flow to support device reset · f6bc11e4
      Yishai Hadas authored
      This includes:
      
      - resetting the chip when a fatal error is detected (the current code
        does not do this).
      
      - exposing the ability to enter error state from outside the catas code
        by calling its functionality. (E.g. FW Command timeout, AER error).
      
      - managing a persistent device state. This is needed to sync between
        reset flow cases.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Refactor the catas flow to work per device · ad9a0bf0
      Yishai Hadas authored
      Use a WQ per device instead of a single global WQ. This allows
      independent reset handling per device even when SRIOV is used.
      
      This comes as a pre-patch for supporting chip reset
      for both native and SRIOV.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Set device configuration data to be persistent across reset · dd0eefe3
      Yishai Hadas authored
      When an HCA enters an internal error state, this is detected by the driver.
      The driver then should reset the HCA and restart the software stack.
      
      Keep ports information and some SRIOV configuration in a persistent area
      to have it valid across reset.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Maintain a persistent memory for mlx4 device · 872bf2fb
      Yishai Hadas authored
      Maintain a persistent memory that should survive reset flow/PCI error.
      This comes as a preparation for coming series to support above flows.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb3: re-use native hex2bin() · 7aee42c6
      Andy Shevchenko authored
      Call hex2bin() library function instead of doing conversion here.
      Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • usbnet: re-use native hex2bin() · 51487ae7
      Andy Shevchenko authored
      Call hex2bin() library function, instead of doing conversion here.
      Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Acked-by: Oliver Neukum <oneukum@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
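For readers unfamiliar with the kind of conversion the two hex2bin() commits hand over to the library, here is a userspace sketch of a hex-string-to-bytes routine. It is an assumption-laden illustration: `hex_digit_value` and `hex_to_bytes` are hypothetical names, and this is neither the kernel's hex2bin() nor the code removed from the drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Convert 2 * count hex characters from src into count bytes in dst. */
int hex_to_bytes(unsigned char *dst, const char *src, size_t count)
{
    while (count--) {
        int hi = hex_digit_value(*src++);
        if (hi < 0)
            return -1;  /* stop before reading past a terminator */
        int lo = hex_digit_value(*src++);
        if (lo < 0)
            return -1;
        *dst++ = (unsigned char)((hi << 4) | lo);
    }
    return 0;
}
```

Keeping one shared routine of this kind avoids each driver carrying its own slightly different parser.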
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · bc0247a4
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2015-01-22
      
      This series contains updates to e1000, e1000e, igb, fm10k and virtio_net.
      
      Asaf Vertz provides a fix for e1000 to future-proof the time comparisons
      by using time_after_eq() instead of plain math.
      
      Mathias Koehrer provides a fix for e1000e to add a check to e1000_xmit_frame()
      to ensure a work queue will not be scheduled that has not been initialized.
      
      Jacob adds the use of software timestamping via the virtio_net driver.
      
      Alex Duyck cleans up the page reuse code in igb and fm10k, getting it
      into a state where all the needed workarounds are in place, and cleans
      up oversights, such as using __free_pages instead of put_page to drop
      a locally allocated page.
      
      Richard Cochran provides 4 patches for igb dealing with time sync.
      First provides a helper function since the code that handles the time
      sync interrupt is repeated in three different places.  Then serializes
      the access to the time sync interrupt since the registers may be
      manipulated from different contexts.  Enables the use of i210 device
      interrupt to generate an internal PPS event for adjusting the kernel
      system time.  The i210 device offers a number of special PTP hardware
      clock features on the Software Defined Pins (SDPs), so added support for
      two of the possible functions (time stamping external events and
      periodic output signals).
      
      Or Gerlitz fixes fm10k from double setting of NETIF_F_SG since the
      networking core does it for the driver during registration time.
      
      Joe Stringer adds support for up to 104 bytes of inner+outer headers in
      fm10k and adds an initial check to fail encapsulation offload if these
      are too large.
      
      Matthew increases the timeout for the data path reset based on feedback
      from the hardware team, since 100us is too short of a time to wait for
      the data path reset to complete.
      
      Alexander Graf provides a fix for igb to indicate failure on VF reset
      for an empty MAC address, to mirror the behavior of ixgbe.
      
      Florian Westphal updates e1000 and e1000e to support txtd update delay
      via xmit_more, this way we won't update the Tx tail descriptor if the
      queue has not been stopped and we know at least one more skb will be
      sent right away.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
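The e1000 item in the series above replaces plain arithmetic on a tick counter with time_after_eq(). The idea behind a wrap-safe comparison can be sketched in userspace as below; `after_eq` and `after_eq_naive` are hypothetical names, and the sketch assumes the usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <limits.h>

/*
 * Wrap-safe "a is at or after b": the unsigned difference is
 * reinterpreted as signed, so a counter that has wrapped past zero
 * still compares as later (assumes two's-complement conversion).
 */
int after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}

/* Plain comparison, shown for contrast; wrong once the counter wraps. */
int after_eq_naive(unsigned long a, unsigned long b)
{
    return a >= b;
}
```

The signed-difference form stays correct as long as the two timestamps are less than half the counter range apart.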
    • Merge branch 'vxlan_tx' · 86b368b4
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      vxlan: Don't use UDP socket for transmit
      
      UDP socket is not pertinent to transmit for UDP tunnels, checksum
      enablement can be done without a socket. This patch set eliminates
      reference to a socket in udp_tunnel_xmit functions and in VXLAN
      transmit.
      
      Also, make GBP, RCO, and CSUM6_RX flags visible to receive socket
      and only match these for shareable socket.
      
      v2: Fix geneve to call udp_tunnel_xmit with good arguments.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>