1. 22 Feb, 2013 1 commit
    • openvswitch: Allow OVS_USERSPACE_ATTR_USERDATA to be variable length. · 4490108b
      Ben Pfaff authored
      Until now, the optional OVS_USERSPACE_ATTR_USERDATA attribute had to be
      exactly 64 bits long, if it was present.  However, 64 bits is not enough
      space to associate as much information with a flow as would be convenient
      for some userspace features now under development.  This commit generalizes
      the attribute, allowing it to be any length.
      This generalization is backward-compatible: if userspace only uses 64-bit
      attributes, then it will not see any change in behavior.
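      The compatibility claim can be pictured from the consumer side with a small sketch. All names below (struct userdata_view and the three helpers) are hypothetical and are not part of the Open vSwitch API; the only facts taken from the commit message are that the payload formerly had to be exactly 64 bits and may now be any length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a userdata payload handed to userspace. */
struct userdata_view {
    const void *data;   /* opaque bytes, owned by the caller */
    size_t len;         /* payload length in bytes */
};

/* Old rule: the payload had to be exactly 64 bits long. */
static int userdata_len_ok_old(size_t len)
{
    return len == sizeof(uint64_t);
}

/* New rule: any length is accepted. */
static int userdata_len_ok_new(size_t len)
{
    (void)len;
    return 1;
}

/*
 * A legacy consumer can keep reading a 64-bit cookie when the payload
 * happens to be 8 bytes. Returns 0 on success, -1 for any other length.
 */
static int userdata_as_u64(const struct userdata_view *u, uint64_t *out)
{
    assert(u != NULL && out != NULL);
    if (u->len != sizeof(*out))
        return -1;
    memcpy(out, u->data, sizeof(*out));
    return 0;
}
```

      A caller that only ever sends 8-byte payloads passes both length checks, which is the sense in which behavior is unchanged for it.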
      CC: Romain Lenglet <rlenglet@vmware.com>
      Signed-off-by: Ben Pfaff <blp@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
  2. 07 Jan, 2013 1 commit
  3. 03 Dec, 2012 2 commits
    • tun: only queue packets on device · 5d097109
      Michael S. Tsirkin authored
      Historically tun supported two modes of operation:
      - in default mode, a small number of packets would get queued
        at the device, the rest would be queued in qdisc
      - in one queue mode, all packets would get queued at the device
      This might have made sense up to a point where we made the
      queue depth for both modes the same and set it to
      a huge value (500) so unless the consumer
      is stuck the chance of losing packets is small.
      Thus in practice both modes behave the same, but the
      default mode has some problems:
      - if packets are never consumed, fragments are never orphaned
        which causes a DoS for a sender using zero copy transmit
      - overrun errors are hard to diagnose: fifo error is incremented
        only once so you can not distinguish between
        userspace that is stuck and a transient failure,
        tcpdump on the device does not show any traffic
      Userspace solves this simply by enabling IFF_ONE_QUEUE
      but there seems to be little point in not doing the
      right thing for everyone, by default.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call · 196d6759
      Michele Baldessari authored
      The current SCTP stack is lacking a mechanism to have per association
      statistics. This is an implementation modeled after OpenSolaris'
      Userspace part will follow on lksctp if/when there is a general ACK on
      - Move ipackets++ before q->immediate.func() for consistency reasons
      - Move sctp_max_rto() at the end of sctp_transport_update_rto() to avoid
        returning bogus RTO values
      - return asoc->rto_min when max_obs_rto value has not changed
      - Increase ictrlchunks in sctp_assoc_bh_rcv() as well
      - Move ipackets++ to sctp_inq_push()
      - return 0 when no rto updates took place since the last call
      - Implement partial retrieval of stat struct to cope for future expansion
      - Kill the rtxpackets counter as it cannot be precise anyway
      - Rename outseqtsns to outofseqtsns to make it clearer that these are out
        of sequence unexpected TSNs
      - Move asoc->ipackets++ under a lock to avoid potential miscounts
      - Fold asoc->opackets++ into the already existing asoc check
      - Kill unneeded (q->asoc) test when increasing rtxchunks
      - Do not count octrlchunks if sending failed (SCTP_XMIT_OK != 0)
      - Don't count SHUTDOWNs as SACKs
      - Move SCTP_GET_ASSOC_STATS to the private space API
      - Adjust the len check in sctp_getsockopt_assoc_stats() to allow for
        future struct growth
      - Move association statistics in their own struct
      - Update idupchunks when we send a SACK with dup TSNs
      - return min_rto in max_rto when RTO has not changed. Also return the
        transport when max_rto last changed.
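      The partial-retrieval and length-check items above can be sketched as follows. This is a minimal illustration and not the kernel code: struct assoc_stats and copy_stats_partial() are hypothetical names, and the three fields are placeholders for whatever the real statistics struct contains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical statistics record; the real layout is defined by the kernel. */
struct assoc_stats {
    uint64_t ipackets;
    uint64_t opackets;
    uint64_t max_rto;
};

/*
 * Copy at most user_len bytes of the record into user_buf and return the
 * number of bytes copied. A caller built against a shorter (older) struct
 * gets the prefix it knows about; a caller with a larger buffer gets the
 * whole record and can tell from the return value how much was filled.
 */
static size_t copy_stats_partial(const struct assoc_stats *src,
                                 void *user_buf, size_t user_len)
{
    size_t n;

    assert(src != NULL && user_buf != NULL);
    n = user_len < sizeof(*src) ? user_len : sizeof(*src);
    memcpy(user_buf, src, n);
    return n;
}
```

      Appending new fields at the end of the struct then stays compatible with callers that were built before the fields existed.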
      Signed-off: Michele Baldessari <michele@acksyn.org>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 02 Dec, 2012 1 commit
  5. 01 Dec, 2012 2 commits
  6. 30 Nov, 2012 2 commits
    • net: move inet_dport/inet_num in sock_common · ce43b03e
      Eric Dumazet authored
      commit 68835aba (net: optimize INET input path further)
      moved some fields used for tcp/udp sockets lookup in the first cache
      line of struct sock_common.
      This patch moves inet_dport/inet_num as well, filling a 32bit hole
      on 64 bit arches and reducing number of cache line misses in lookups.
      Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
      before addresses match, as this check is more discriminant.
      Remove the hash check from MATCH() macros because we don't need to
      revalidate the hash value after taking a refcount on socket, and
      use likely/unlikely compiler hints, as the sk_hash/hash check
      makes the following conditional tests 100% predicted by cpu.
      Introduce skc_addrpair/skc_portpair pair values to better
      document the alignment requirements of the port/addr pairs
      used in the various MATCH() macros, and remove some casts.
      The namespace check can also be done at last.
      This slightly improves TCP/UDP lookup times.
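      The idea of comparing a port pair as one value, and of testing ports before addresses, can be sketched in userspace C. The types and functions here (struct port_pair, struct flow_key, ports_as_u32(), flow_match()) are hypothetical stand-ins and do not reproduce the kernel's struct sock_common or its MATCH() macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for two adjacent 16-bit port fields. */
struct port_pair {
    uint16_t dport;  /* destination port */
    uint16_t num;    /* local port */
};

/* Load both ports as one 32-bit value; memcpy sidesteps aliasing issues. */
static uint32_t ports_as_u32(const struct port_pair *p)
{
    uint32_t v;

    assert(sizeof(*p) == sizeof(v));
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical lookup key: a port pair followed by an address pair. */
struct flow_key {
    struct port_pair ports;
    uint32_t saddr;
    uint32_t daddr;
};

static int flow_match(const struct flow_key *a, const struct flow_key *b)
{
    /* Ports first: a single 32-bit compare that rejects most non-matches. */
    return ports_as_u32(&a->ports) == ports_as_u32(&b->ports) &&
           a->saddr == b->saddr && a->daddr == b->daddr;
}
```

      Keeping the two 16-bit fields adjacent and naturally aligned is what makes the single 32-bit comparison possible.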
      IP/TCP early demux needs inet->rx_dst_ifindex and
      TCP needs inet->min_ttl, let's group them together in the same cache line.
      With help from Ben Hutchings & Joe Perches.
      Idea of this patch came after Ling Ma proposal to move skc_hash
      to the beginning of struct sock_common, and should allow him
      to submit a final version of his patch. My tests show an improvement
      doing so.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Ling Ma <ling.ma.program@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnelink: remove unused parameter from rtnl_create_link(). · c0713563
      Rami Rosen authored
      This patch removes an unused parameter (src_net) from the rtnl_create_link()
      method and from its single invocation, in veth.
      This parameter was used in the past when calling
      ops->get_tx_queues(src_net, tb) in rtnl_create_link().
      The get_tx_queues() member of rtnl_link_ops was replaced by two methods,
      get_num_tx_queues() and get_num_rx_queues(), which do not get any
      parameter. This was done in commit d40156aa by
      Jiri Pirko ("rtnl: allow to specify different num for rx and tx queue count").
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 29 Nov, 2012 1 commit
  8. 28 Nov, 2012 1 commit
  9. 27 Nov, 2012 1 commit
  10. 26 Nov, 2012 17 commits
    • Revert "mm: remove __GFP_NO_KSWAPD" · 82b212f4
      Mel Gorman authored
      With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
      based on failures" reverted, Zdenek Kabelac reported the following
        Hmm,  so it's just took longer to hit the problem and observe
        kswapd0 spinning on my CPU again - it's not as endless like before -
        but still it easily eats minutes - it helps to	turn off  Firefox
        or TB  (memory hungry apps) so kswapd0 stops soon - and restart
        those apps again.  (And I still have like >1GB of cached memory)
        kswapd0         R  running task        0    30      2 0x00000000
        Call Trace:
      The sysrq+m indicates the system has no swap so it'll never reclaim
      anonymous pages as part of reclaim/compaction.  That is one part of the
      problem but not the root cause as file-backed pages could also be
      The likely underlying problem is that kswapd is woken up or kept awake
      for each THP allocation request in the page allocator slow path.
      If compaction fails for the requesting process then compaction will be
      deferred for a time and direct reclaim is avoided.  However, if there
      is a storm of THP requests that are simply rejected, it will still be
      the case that kswapd is awake for a prolonged period of time as
      pgdat->kswapd_max_order is updated each time.  This is noticed by the
      main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
      it will loop, shrinking a small number of pages and calling
      shrink_slab() on each iteration.
      The temptation is to supply a patch that checks if kswapd was woken for
      THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
      backed up by proper testing.  As 3.7 is very close to release and this
      is not a bug we should release with, a safer path is to revert "mm:
      remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
      out the balance_pgdat() logic in general.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Zdenek Kabelac <zkabelac@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: Jiri Slaby <jirislaby@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID · c5782e9f
      Tushar Behera authored
      Commit baf05aa9 ("bug: introduce BUILD_BUG_ON_INVALID() macro")
      introduced this macro only when __CHECKER__ is not defined.  Define a
      silent macro in the else condition to fix the following sparse warning:
        mm/filemap.c:395:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:396:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        mm/filemap.c:397:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: undefined identifier 'BUILD_BUG_ON_INVALID'
        include/linux/mm.h:419:9: error: not a function <noident>
      Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
      Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name · c91f6df2
      Brian Haley authored
      Instead of having the getsockopt() of SO_BINDTODEVICE return an index, which
      will then require another call like if_indextoname() to get the actual interface
      name, have it return the name directly.
      This also matches the existing man page description on socket(7) which mentions
      the argument being an interface name.
      If the value has not been set, zero is returned and optlen will be set to zero
      to indicate there is no interface name present.
      Added a seqlock to protect this code path, and dev_ifname(), from someone
      changing the device name via dev_change_name().
      v2: Added seqlock protection while copying device name.
      v3: Fixed word wrap in patch.
      Signed-off-by: Brian Haley <brian.haley@hp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • stmmac: add Rx watchdog support to mitigate the DMA irqs · 62a2ab93
      Giuseppe CAVALLARO authored
      GMAC devices newer than databook 3.40 have an embedded timer
      that can be used for mitigating the number of interrupts.
      This patch adds this optimization.
      In any case, the Rx watchdog can be disabled (on buggy HW) by
      passing the riwt_off field from the platform.
      In this implementation the rx timer stored in the Reg9 is fixed
      to the max value. This will be tuned by using ethtool.
      V2: added a platform parameter to force-disable the rx-watchdog,
      for example on a new core where it is buggy.
      V3: do not disable NAPI when Rx watchdog is used.
      V4: a new extra statistic field has been added to show the early
      receive status in the interrupt handler.
      This patch also adds an extra check to avoid calling
      napi_schedule when the DMA_INTR_ENA_RIE bit is disabled in the
      Interrupt Mask register.
      Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bcma: add more package IDs · 0751f865
      Hauke Mehrtens authored
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • openvswitch: add skb mark matching and set action · 39c7caeb
      Ansis Atteka authored
      This patch adds support for skb mark matching and set action.
      Signed-off-by: Ansis Atteka <aatteka@nicira.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
    • wireless: add definitions for VHT MCS support · 7173a1fa
      Johannes Berg authored
      Add definitions for the VHT MCS support values that
      are used to indicate, for each number of streams
      (1 through 8) which MCSes are supported.
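      How such per-stream values are typically read can be sketched as below. The sketch assumes the 802.11ac encoding of the supported-MCS map, with two bits per spatial stream; the enum and function names are hypothetical and are not the identifiers added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Per-stream support values (two bits each), assuming the 802.11ac encoding. */
enum vht_mcs_support {
    VHT_MCS_0_7           = 0,  /* MCS 0 through 7 supported */
    VHT_MCS_0_8           = 1,  /* MCS 0 through 8 supported */
    VHT_MCS_0_9           = 2,  /* MCS 0 through 9 supported */
    VHT_MCS_NOT_SUPPORTED = 3,  /* this stream count is not supported */
};

/* Extract the support value for nss spatial streams (1 through 8). */
static enum vht_mcs_support vht_mcs_for_nss(uint16_t mcs_map, unsigned nss)
{
    assert(nss >= 1 && nss <= 8);
    return (enum vht_mcs_support)((mcs_map >> (2 * (nss - 1))) & 0x3);
}
```

      With a map of 0xfffa, for example, streams 1 and 2 decode to MCS 0 through 9 and every higher stream count decodes to "not supported".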
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: support VHT rates in TX info · 8bc83c24
      Johannes Berg authored
      To achieve this, limit the number of retries to
      31 (instead of 255) and use the three bits that
      are then free for VHT flags.
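      The arithmetic behind this trade can be shown with a byte-packing sketch: capping the count at 31 needs only five bits, so three of the eight bits become available. The macros and functions are hypothetical, and the real mac80211 structure may lay the fields out differently (for instance as bitfields).

```c
#include <assert.h>
#include <stdint.h>

#define RETRY_BITS  5u
#define RETRY_MAX   ((1u << RETRY_BITS) - 1u)   /* 31 */
#define FLAG_SHIFT  RETRY_BITS                  /* flags use the top 3 bits */

/* Pack a retry count (clamped to 31) and a 3-bit flag set into one byte. */
static uint8_t pack_rate_byte(unsigned retries, unsigned flags3)
{
    assert(flags3 <= 7u);
    if (retries > RETRY_MAX)
        retries = RETRY_MAX;
    return (uint8_t)(retries | (flags3 << FLAG_SHIFT));
}

static unsigned unpack_retries(uint8_t b)
{
    return b & RETRY_MAX;
}

static unsigned unpack_flags(uint8_t b)
{
    return (unsigned)b >> FLAG_SHIFT;
}
```

      A request for 255 retries is stored as 31 here, which mirrors the lower limit the commit message describes.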
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: support drivers reporting VHT RX · 5614618e
      Johannes Berg authored
      Add support to mac80211 for having drivers report
      received VHT MCS information.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211/cfg80211: add VHT MCS support · db9c64cf
      Johannes Berg authored
      Add support for reporting and calculating VHT MCSes.
      Note that I'm not completely sure that the bitrate
      calculations are correct, nor that they can't be
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: convert to channel definition struct · 4bf88530
      Johannes Berg authored
      Convert mac80211 (and where necessary, some drivers a
      little bit) to the new channel definition struct.
      This will allow extending mac80211 for VHT, which is
      currently restricted to channel contexts since there
      are no drivers using that which makes it easier. As
      I also don't care about VHT for drivers not using the
      channel context API, I won't convert the previous API
      to VHT support.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • nl80211/cfg80211: support VHT channel configuration · 3d9d1d66
      Johannes Berg authored
      Change nl80211 to support specifying a VHT (or HT) channel
      using the control channel frequency (as before) and
      new attributes for the channel width and first and
      second center frequency. The old channel type is of
      course still supported for HT.
      Also change the cfg80211 channel definition struct
      to support these by adding the relevant fields to
      it (and removing the _type field.)
      This also adds new helper functions:
       - cfg80211_chandef_create to create a channel def
         struct given the control channel and channel type,
       - cfg80211_chandef_identical to check if two channel
         definitions are identical
       - cfg80211_chandef_compatible to check if the given
         channel definitions are compatible, and return the
         wider of the two
      This isn't entirely complete, but that doesn't matter
      until we have a driver using it. In particular, it's
       - regulatory checks on the usable bandwidth (if that
         even makes sense)
       - regulatory TX power (database can't deal with it)
       - a proper channel compatibility calculation for the
         new channel types
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: pass a channel definition struct · 683b6d3b
      Johannes Berg authored
      Instead of passing a channel pointer and channel type
      to all functions and driver methods, pass a new channel
      definition struct. Right now, this struct contains just
      the control channel and channel type, but for VHT this
      will change.
      Also, add a small inline cfg80211_get_chandef_type() so
      that drivers don't need to use the _type field of the
      new structure all the time, which will change.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: remove remain-on-channel channel type · 42d97a59
      Johannes Berg authored
      As mwifiex (and mac80211 in the software case) are the
      only drivers actually implementing remain-on-channel
      with channel type, userspace can't be relying on it.
      This is the case, as it's used only for P2P operations
      right now.
      Rather than adding a flag to tell userspace whether or
      not it can actually rely on it, simplify all the code
      by removing the ability to use different channel types.
      Leave only the validation of the attribute, so that if
      we extend it again later (with the needed capability
      flag), it can't break userspace sending invalid data.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • cfg80211: change function signature of cfg80211_get_p2p_attr() · c216e641
      Arend van Spriel authored
      The function cfg80211_get_p2p_attr() can fail, in which case it returns
      a negative error code. However, its return type is unsigned
      int. The largest positive value it can return is bounded by the
      desired_len variable in the function, which is a u16. So change the
      return type to int to allow easy error checking. Also change the type
      of the attribute to an enum for improved type checking.
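      The problem with the unsigned return type can be reproduced in a few lines of C. Both functions below are hypothetical stand-ins, not the cfg80211 code; they only show what happens to a negative error code under each return type.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: a negative errno squeezed through an unsigned return type. */
static unsigned int attr_len_unsigned(const uint8_t *buf, uint16_t desired_len)
{
    if (buf == NULL)
        return (unsigned int)-EINVAL;   /* wraps to a huge positive value */
    return desired_len;
}

/* New style: int carries both error codes and any u16 length. */
static int attr_len_signed(const uint8_t *buf, uint16_t desired_len)
{
    assert(sizeof(int) > sizeof(uint16_t));   /* every u16 fits in an int */
    if (buf == NULL)
        return -EINVAL;
    return desired_len;
}
```

      With the unsigned variant a caller's "result < 0" test can never be true, so the failure looks like a very large length; with the signed variant the same test works as expected.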
      Signed-off-by: Arend van Spriel <arend@broadcom.com>
      [fix indentation, don't use u8 attr variable]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
  11. 23 Nov, 2012 3 commits
  12. 21 Nov, 2012 4 commits
  13. 20 Nov, 2012 3 commits
    • sctp: send abort chunk when max_retrans exceeded · de4594a5
      Neil Horman authored
      In the event that an association exceeds its max_retrans attempts, we should
      send an ABORT chunk indicating that we are closing the association as a result.
      Because of the nature of the error, it's unlikely to be received, but it's a nice
      clean way to close the association if it does make it through, and it will give
      anyone watching via tcpdump a clue as to what happened.
      Change notes:
      	* Removed erroneous changes from sctp_make_violation_parmlen
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Vlad Yasevich <vyasevich@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: linux-sctp@vger.kernel.org
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sit: allow to configure 6rd tunnels via netlink · e2f1f072
      Nicolas Dichtel authored
      This patch adds support for managing 6RD tunnels via netlink.
      Note that netdev_state_change() is now called when 6RD parameters are updated.
      6RD parameters are updated only if there is at least one 6RD attribute.
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • add DOVE extensions for VXLAN · e4f67add
      David Stevens authored
      This patch provides extensions to VXLAN for supporting Distributed
      Overlay Virtual Ethernet (DOVE) networks. The patch includes:
      	+ a dove flag per VXLAN device to enable DOVE extensions
      	+ ARP reduction, whereby a bridge-connected VXLAN tunnel endpoint
      		answers ARP requests from the local bridge on behalf of
      		remote DOVE clients
      	+ route short-circuiting (aka L3 switching). Known destination IP
      		addresses use the corresponding destination MAC address for
      		switching rather than going to a (possibly remote) router first.
      	+ netlink notification messages for forwarding table and L3 switching
      Changes since v2
      	- combined bools into "u32 flags"
      	- replaced loop with !is_zero_ether_addr()
      Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 19 Nov, 2012 1 commit