1. 16 May, 2012 8 commits
  2. 15 May, 2012 7 commits
  3. 08 May, 2012 12 commits
    • Ashok Nagarajan's avatar
      {nl,cfg,mac}80211: Allow user to see/configure HT protection mode · 70c33eaa
      Ashok Nagarajan authored
      This patch introduces a new mesh configuration parameter, "ht_opmode", that
      lets the user check the currently selected HT protection mode. Users can
      configure the protection mode with the command "iw mesh_iface set mesh_param
      mesh_ht_protection_mode=2". The default protection mode for mesh is
      non-HT mixed mode.
      Signed-off-by: Ashok Nagarajan <ashok@cozybit.com>
      Reviewed-by: Thomas Pedersen <thomas@cozybit.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • Ben Greear's avatar
      mac80211: Framework to get wifi-driver stats via ethtool. · e352114f
      Ben Greear authored
      This adds hooks to call into the driver to get additional
      stats for the ethtool API.
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
    • Pablo Neira Ayuso's avatar
      netfilter: remove ip_queue support · d16cf20e
      Pablo Neira Ayuso authored
      This patch removes ip_queue support, which was marked as obsolete
      years ago. The nfnetlink_queue module provides a more advanced
      user-space packet queueing mechanism.
      This patch also removes capability code included in SELinux that
      refers to ip_queue. Otherwise, we break compilation.
      Several warnings have been sent to the mailing list about this
      over the past month, and nobody has raised a hand to stop it
      with a strong argument.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Pablo Neira Ayuso's avatar
      netfilter: nf_conntrack: fix explicit helper attachment and NAT · 6714cf54
      Pablo Neira Ayuso authored
      Explicit helper attachment via the CT target is broken with NAT
      if non-standard ports are used. This problem was hidden behind
      the automatic helper assignment routine. Thus, it becomes more
      noticeable now that we can disable the automatic helper assignment
      with Eric Leblond's:
      9e8ac5a netfilter: nf_ct_helper: allow to disable automatic helper assignment
      Basically, nf_conntrack_alter_reply requests a second helper
      look-up if NAT is enabled. Unfortunately, we don't have the conntrack
      template at that point anymore.
      Since we don't want to rely on the automatic helper assignment,
      we can skip the second look-up and stick to the helper that was
      attached by iptables. With the CT target, the user is in full
      control of helper attachment, thus, the policy is to trust what
      the user explicitly configures via iptables (no automatic magic
      Interestingly, this bug was hidden by the automatic helper look-up
      code. But it can easily be triggered if you attach the helper on
      a non-standard port, e.g.
      iptables -I PREROUTING -t raw -p tcp --dport 8888 \
      	-j CT --helper ftp
      and you have disabled the automatic helper assignment.
      I added the IPS_HELPER_BIT that allows us to differentiate between
      a helper that has been explicitly attached and those that have been
      automatically assigned. I didn't come up with a better solution
      (having backward compatibility in mind).
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
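The policy described in this commit message can be illustrated with a small model. This is only a sketch, not the kernel code: the flag value, the `AUTO_HELPERS` table, and the function names are hypothetical stand-ins. It shows why a port-based second look-up cannot find a helper that was attached on port 8888, and how a flag marking explicit attachment lets that look-up be skipped.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical flag value, for illustration only (not taken from kernel headers).
HELPER_EXPLICIT = 1 << 0

# Hypothetical port-to-helper table standing in for automatic helper assignment.
AUTO_HELPERS = {21: "ftp"}


@dataclass
class Conntrack:
    dport: int
    helper: Optional[str] = None
    status: int = 0


def attach_explicit(ct: Conntrack, helper: str) -> None:
    """Model of attaching a helper via the CT target: remember it was explicit."""
    ct.helper = helper
    ct.status |= HELPER_EXPLICIT


def alter_reply(ct: Conntrack, auto_assign: bool) -> None:
    """Model of the second helper look-up requested when NAT alters the reply."""
    if ct.status & HELPER_EXPLICIT:
        return  # trust what the user configured; skip the second look-up
    if auto_assign:
        # Port-based look-up; returns None for non-standard ports.
        ct.helper = AUTO_HELPERS.get(ct.dport)


ct = Conntrack(dport=8888)
attach_explicit(ct, "ftp")
alter_reply(ct, auto_assign=True)
print(ct.helper)
```

Without the flag check, the port-based look-up for 8888 would find nothing in this model, which mirrors the breakage the message describes.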
    • Pablo Neira Ayuso's avatar
      ipvs: add support for sync threads · f73181c8
      Pablo Neira Ayuso authored
      	Allow master and backup servers to use many threads
      for sync traffic. Add sysctl var "sync_ports" to define the
      number of threads. Every thread uses a single UDP port:
      thread 0 uses the default port 8848, while the last thread
      uses port 8848+sync_ports-1.
      	The sync traffic for connections is scheduled to many
      master threads based on the cp address, but one connection is
      always assigned to the same thread to avoid reordering of the
      sync messages.
      	Remove ip_vs_sync_switch_mode because this check
      for sync mode change is still risky. Instead, check for mode
      change under sync_buff_lock.
      	Make sure the backup socks do not block on reading.
      Special thanks to Aleksey Chudov for helping in all tests.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
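The port layout and the per-connection thread choice described above can be sketched as follows. The port arithmetic comes from the commit message; the mapping from a connection address to a thread index is an assumption (the message only says it is based on the cp address), and the function names are hypothetical.

```python
DEFAULT_SYNC_PORT = 8848  # default port named in the commit message


def sync_thread_ports(sync_ports: int) -> list:
    """Thread i uses port 8848 + i, so the last thread uses 8848+sync_ports-1."""
    return [DEFAULT_SYNC_PORT + i for i in range(sync_ports)]


def pick_thread(conn_addr: int, sync_ports: int) -> int:
    """Assumed mapping: derive a stable thread index from the connection address.

    The low bits are shifted away because allocated objects are usually
    aligned; the exact shift is an illustrative choice, not the kernel's.
    """
    return (conn_addr >> 6) % sync_ports


print(sync_thread_ports(4))
print(pick_thread(0x7F001240, 4))
```

Because the index depends only on the connection's address, every sync message for a given connection goes through the same thread, which is what keeps the messages in order.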
    • Julian Anastasov's avatar
      ipvs: reduce sync rate with time thresholds · 749c42b6
      Julian Anastasov authored
      	Add two new sysctl vars to control the sync rate. The
      main idea is to reduce the rate for connection templates, because
      it currently depends on the packet rate for controlled connections.
      This mechanism should also be useful for normal connections
      with high traffic.
      sync_refresh_period: in seconds, difference in reported connection
      	timer that triggers a new sync message. It can be used to
      	avoid sync messages for the specified period (or half of
      	the connection timeout if it is lower) if the connection state
      	has not changed since the last sync.
      sync_retries: integer, 0..3, defines sync retries with period of
      	sync_refresh_period/8. Useful to protect against loss of
      	sync messages.
      	Allow sysctl_sync_threshold to be used with
      sysctl_sync_period=0, so that only a single sync message is sent
      if sync_refresh_period is also 0.
      	Add new field "sync_endtime" in connection structure to
      hold the reported time when connection expires. The 2 lowest
      bits will represent the retry count.
      	As sysctl_sync_period can now be 0, use ACCESS_ONCE to
      avoid division by zero.
      	Special thanks to Aleksey Chudov for being patient with me,
      for his extensive reports and helping in all tests.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
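The idea of keeping the retry count in the 2 lowest bits of "sync_endtime" can be sketched with plain bit masking. This is a minimal illustration under the assumption that the expiry time simply loses its two low bits of precision; the helper names are hypothetical and the kernel's actual layout may differ.

```python
RETRY_BITS = 2
RETRY_MASK = (1 << RETRY_BITS) - 1  # 0b11, enough for sync_retries in 0..3


def pack_endtime(expires: int, retries: int) -> int:
    """Store the retry count in the 2 lowest bits of the reported expiry time."""
    if not 0 <= retries <= RETRY_MASK:
        raise ValueError("retries must be in 0..3")
    return (expires & ~RETRY_MASK) | retries


def unpack_endtime(value: int) -> tuple:
    """Return (expiry time with the low bits cleared, retry count)."""
    return value & ~RETRY_MASK, value & RETRY_MASK


def retry_period(sync_refresh_period: float) -> float:
    """Retries are spaced sync_refresh_period/8 apart, per the commit message."""
    return sync_refresh_period / 8


packed = pack_endtime(1003, 2)
print(unpack_endtime(packed))
```

Two bits hold exactly the values 0..3, which matches the documented range of sync_retries.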
    • Pablo Neira Ayuso's avatar
      ipvs: wakeup master thread · 1c003b15
      Pablo Neira Ayuso authored
      	A high rate of sync messages in the master can overflow
      the socket buffer, causing messages to be dropped.
      A fixed sleep of 1 second without wakeup events is not suitable
      for loaded masters.
      	Use delayed_work to schedule sending for queued messages
      and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will
      reduce the rate of wakeups, but to avoid sending long bursts we
      wake up the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages.
      	Add hard limit for the queued messages before sending
      by using "sync_qlen_max" sysctl var. It defaults to 1/32 of
      the memory pages but actually represents number of messages.
      It will protect us from allocating large parts of memory
      when the sending rate is lower than the queuing rate.
      	As suggested by Pablo, add new sysctl var
      "sync_sock_size" to configure the SNDBUF (master) or
      RCVBUF (slave) socket limit. Default value is 0 (preserve
      system defaults).
      	Change the master thread to detect and block on
      SNDBUF overflow, so that we do not drop messages when
      the socket limit is low but the sync_qlen_max limit is
      not reached. On ENOBUFS or other errors just drop the
      	Change master thread to enter TASK_INTERRUPTIBLE
      state early, so that we do not miss wakeups due to messages or
      kthread_should_stop event.
      Thanks to Pablo Neira Ayuso for his valuable feedback!
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Simon Horman <horms@verge.net.au>
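The interplay between the queue limit and the wakeup rate can be modelled roughly as below. This is a simplified sketch, not the kernel implementation: the class and its counters are hypothetical, the wakeup is reduced to a counter, and dropping a message when the limit is reached is an assumption about how the hard limit is enforced.

```python
from collections import deque

IPVS_SYNC_WAKEUP_RATE = 8  # value given in the commit message


class SyncQueue:
    """Bounded queue of sync messages with a periodic wakeup of the sender."""

    def __init__(self, qlen_max: int):
        self.qlen_max = qlen_max  # models the "sync_qlen_max" sysctl var
        self.queue = deque()
        self.wakeups = 0          # stands in for waking the master thread
        self.dropped = 0

    def enqueue(self, msg) -> bool:
        if len(self.queue) >= self.qlen_max:
            self.dropped += 1     # hard limit reached: do not grow further
            return False
        self.queue.append(msg)
        # Wake the sender every IPVS_SYNC_WAKEUP_RATE queued messages so
        # that it does not have to send one long burst later.
        if len(self.queue) % IPVS_SYNC_WAKEUP_RATE == 0:
            self.wakeups += 1
        return True


q = SyncQueue(qlen_max=10)
accepted = [q.enqueue(i) for i in range(12)]
print(accepted.count(True), q.dropped, q.wakeups)
```

The limit counts messages rather than bytes, in line with the note that sync_qlen_max "actually represents number of messages".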
    • Julian Anastasov's avatar
      ipvs: always update some of the flags bits in backup · cdcc5e90
      Julian Anastasov authored
      	As the goal is to mirror the inactconns/activeconns
      counters in the backup server, make sure the cp->flags are
      updated even if cp is still not bound to dest. If cp->flags
      are not updated ip_vs_bind_dest will rely only on the initial
      flags when updating the counters. To avoid mistakes and
      complicated checks for protocol state rely only on the
      IP_VS_CONN_F_INACTIVE bit when updating the counters.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com>
      Signed-off-by: Simon Horman <horms@verge.net.au>
    • Eric Dumazet's avatar
      netfilter: nf_conntrack: use this_cpu_inc() · ac3a546a
      Eric Dumazet authored
      this_cpu_inc() is IRQ safe and faster than
      local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Eric Leblond's avatar
      netfilter: nf_ct_helper: allow to disable automatic helper assignment · a9006892
      Eric Leblond authored
      This patch allows you to disable automatic conntrack helper
      lookup based on TCP/UDP ports, e.g.
      echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper
      [ Note: flows that already got a helper will keep using it even
        if automatic helper assignment has been disabled ]
      Once this behaviour has been disabled, you have to explicitly
      use the iptables CT target to attach helper to flows.
      There are good reasons to stop supporting automatic helper
      assignment, for further information, please read:
      This patch also adds one message to inform that automatic helper
      assignment is deprecated and will be removed soon (this is
      printed only once, with the first flow that gets a helper attached,
      to make it as unobtrusive as possible).
      Signed-off-by: Eric Leblond <eric@regit.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Joe Perches's avatar
      etherdev.h: Convert int is_<foo>_ether_addr to bool · b44907e6
      Joe Perches authored
      Make the return value explicitly true or false.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 07 May, 2012 3 commits
    • David Daney's avatar
      netdev/of/phy: Add MDIO bus multiplexer support. · 0ca2997d
      David Daney authored
      This patch adds a somewhat generic framework for MDIO bus
      multiplexers.  It is modeled on the I2C multiplexer.
      The multiplexer is needed if there are multiple PHYs with the same
      address connected to the same MDIO bus adapter, or if there is
      insufficient electrical drive capability for all the connected PHY
      Conceptually it could look something like this:
                         | Control Signal |
       ---------------   --------+------
       | MDIO MASTER |---| Multiplexer |
       ---------------   --+-------+----
                           |       |
                           C       C
                           h       h
                           i       i
                           l       l
                           d       d
                           |       |
           ---------       A       B   ---------
           |       |       |       |   |       |
           | PHY@1 +-------+       +---+ PHY@1 |
           |       |       |       |   |       |
           ---------       |       |   ---------
           ---------       |       |   ---------
           |       |       |       |   |       |
           | PHY@2 +-------+       +---+ PHY@2 |
           |       |                   |       |
           ---------                   ---------
      This framework configures the bus topology from device tree data.  The
      mechanics of switching the multiplexer is left to device specific
      The follow-on patch contains a multiplexer driven by GPIO lines.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
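The multiplexer concept in the diagram can be illustrated with a toy model: two child buses each carry a PHY at the same address, and the multiplexer selects a child before forwarding an access. This is a conceptual sketch only; the class, the `switch_fn` callback, and the register values are hypothetical and do not reflect the kernel's MDIO mux API.

```python
class MdioMux:
    """Toy model of an MDIO bus multiplexer with device-specific switching."""

    def __init__(self, switch_fn):
        self.switch_fn = switch_fn  # hypothetical device-specific callback
        self.current = None         # child bus currently selected
        self.children = {}          # child name -> {phy_addr: {reg: value}}

    def add_child(self, name, phys):
        self.children[name] = phys

    def read(self, child, phy_addr, reg):
        # Switch the multiplexer only when a different child is addressed.
        if self.current != child:
            self.switch_fn(child)
            self.current = child
        return self.children[child][phy_addr][reg]


switch_log = []
mux = MdioMux(switch_log.append)
mux.add_child("A", {1: {2: 0x1111}})
mux.add_child("B", {1: {2: 0x2222}})

# Both PHYs sit at address 1; the child bus disambiguates them.
print(hex(mux.read("A", 1, 2)), hex(mux.read("B", 1, 2)), switch_log)
```

Keeping the switching behind a callback mirrors the split the commit message describes between a generic framework and device-specific switching mechanics.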
    • David Daney's avatar
      netdev/of/phy: New function: of_mdio_find_bus(). · 25106022
      David Daney authored
      Add of_mdio_find_bus(), which allows an mii_bus to be located given its
      associated device tree node.
      This is needed by the follow-on patch to add a driver for MDIO bus
      The of_mdiobus_register() function is modified so that the device tree
      node is recorded in the mii_bus.  Then we can find it again by
      iterating over all mdio_bus_class devices.
      Because the OF device tree has now become an integral part of the
      kernel, this can live in mdio_bus.c (which contains the needed
      mdio_bus_class structure) instead of of_mdio.c.
      Signed-off-by: David Daney <david.daney@cavium.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Johannes Berg's avatar
      net: compare_ether_addr[_64bits]() has no ordering · 1c430a72
      Johannes Berg authored
      Neither compare_ether_addr() nor compare_ether_addr_64bits()
      (as it can fall back to the former) have comparison semantics
      like memcmp() where the sign of the return value indicates sort
      order. We had a bug in the wireless code due to a blind memcmp
      replacement because of this.
      A cursory look suggests that the wireless bug was the only one
      due to this semantic difference.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
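The semantic difference from memcmp() can be shown with a small model of an equality-only comparison. This is a sketch in the spirit of the description, not a transcription of the kernel function: it returns 0 for equal addresses and 1 otherwise, so the result says nothing about sort order.

```python
def compare_ether_addr(a: bytes, b: bytes) -> int:
    """Return 0 if the two 6-byte addresses are equal, 1 otherwise.

    The result is built by OR-ing XOR differences, so unlike memcmp()
    it carries no sign and cannot be used to order addresses.
    """
    if len(a) != 6 or len(b) != 6:
        raise ValueError("Ethernet addresses are 6 bytes long")
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return int(diff != 0)


lo = bytes.fromhex("001122334455")
hi = bytes.fromhex("001122334456")

# Swapping the arguments gives the same result, so no ordering is implied.
print(compare_ether_addr(lo, hi), compare_ether_addr(hi, lo))
```

Code that replaces a memcmp() call with such a function is only safe where the caller tests the result against zero.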
  5. 06 May, 2012 1 commit
  6. 04 May, 2012 4 commits
  7. 03 May, 2012 1 commit
  8. 02 May, 2012 3 commits
    • Eric Dumazet's avatar
      net: implement tcp coalescing in tcp_queue_rcv() · b081f85c
      Eric Dumazet authored
      Extend tcp coalescing implementing it from tcp_queue_rcv(), the main
      receiver function when application is not blocked in recvmsg().
      Function tcp_queue_rcv() is moved a bit to allow its call from
      This gives good results, especially if GRO could not kick in and if the skb
      head is a fragment.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Yuchung Cheng's avatar
      tcp: early retransmit: delayed fast retransmit · 750ea2ba
      Yuchung Cheng authored
      Implement the advanced early retransmit (sysctl_tcp_early_retrans==2),
      which delays the fast retransmit by an interval of RTT/4. We borrow the
      RTO timer to implement the delay. If we receive another ACK or send
      a new packet, the timer is cancelled and restored to the original RTO
      value offset by the time elapsed.  When the delayed-ER timer fires,
      we enter fast recovery and perform fast retransmit.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
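The timer borrowing described above can be sketched with a single timer object. This is an illustrative model only: the class and method names are hypothetical, times are in integer milliseconds, and "restored to the original RTO value offset by the time elapsed" is interpreted as re-arming for whatever remains of the original RTO.

```python
class RetransTimer:
    """One timer shared between the RTO and the delayed early retransmit."""

    def __init__(self, now_ms: int, rto_ms: int):
        self.armed_at_ms = now_ms          # when the RTO timer was started
        self.rto_ms = rto_ms
        self.expires_ms = now_ms + rto_ms
        self.delayed_er = False

    def arm_delayed_er(self, now_ms: int, rtt_ms: int) -> None:
        """Borrow the timer to delay the fast retransmit by RTT/4."""
        self.delayed_er = True
        self.expires_ms = now_ms + rtt_ms // 4

    def cancel_delayed_er(self, now_ms: int) -> None:
        """On another ACK or a new packet, go back to the original RTO,
        reduced by the time that has already elapsed."""
        self.delayed_er = False
        remaining = max(self.rto_ms - (now_ms - self.armed_at_ms), 0)
        self.expires_ms = now_ms + remaining


t = RetransTimer(now_ms=0, rto_ms=1000)
t.arm_delayed_er(now_ms=200, rtt_ms=400)
print(t.expires_ms)
t.cancel_delayed_er(now_ms=250)
print(t.expires_ms)
```

If the delayed-ER deadline is reached without being cancelled, the message says the sender enters fast recovery and performs the fast retransmit.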
    • Yuchung Cheng's avatar
      tcp: early retransmit · eed530b6
      Yuchung Cheng authored
      This patch implements RFC 5827 early retransmit (ER) for TCP.
      It reduces DUPACK threshold (dupthresh) if outstanding packets are
      less than 4 to recover losses by fast recovery instead of timeout.
      While the algorithm is simple, small but frequent network reordering
      makes this feature dangerous: the connection repeatedly enters
      false recovery and degrades performance. Therefore we implement
      a mitigation suggested in the appendix of the RFC that delays
      entering fast recovery by a small interval, i.e., RTT/4. Currently
      ER is conservative and is disabled for the rest of the connection
      after the first reordering event. A large scale web server
      experiment on the performance impact of ER is summarized in
      section 6 of the paper "Proportional Rate Reduction for TCP",
      IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf
      Note that Linux has a similar feature called THIN_DUPACK. The
      differences are that THIN_DUPACK does not mitigate reordering and is only
      used after slow start. Currently ER is disabled if THIN_DUPACK is
      enabled. I would be happy to merge THIN_DUPACK feature with ER if
      people think it's a good idea.
      ER is enabled by sysctl_tcp_early_retrans:
        0: Disables ER
        1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4.
        2: (Default) reduce dupthresh like mode 1. In addition, delay
           entering fast recovery by RTT/4.
      Note: mode 2 is implemented in the third part of this patch series.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
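The three sysctl modes listed above can be summarised in a short sketch. It follows the wording of the commit message rather than the kernel source: the function names and the default threshold constant are assumptions, and leaving the threshold unchanged for a single outstanding packet is a simplification made here (no duplicate ACK can arrive in that case).

```python
DEFAULT_DUPTHRESH = 3  # conventional duplicate-ACK threshold, assumed here


def er_dupthresh(packets_out: int, mode: int) -> int:
    """Duplicate-ACK threshold under early retransmit modes 0, 1 and 2."""
    if mode in (1, 2) and 1 < packets_out < 4:
        return packets_out - 1
    return DEFAULT_DUPTHRESH


def fast_recovery_delay_ms(rtt_ms: int, mode: int) -> int:
    """Mode 2 additionally delays entering fast recovery by RTT/4."""
    return rtt_ms // 4 if mode == 2 else 0


for mode in (0, 1, 2):
    print(mode, er_dupthresh(3, mode), fast_recovery_delay_ms(200, mode))
```

With 3 packets outstanding, modes 1 and 2 lower the threshold to 2, so two duplicate ACKs are enough to trigger recovery instead of waiting for a timeout.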
  9. 01 May, 2012 1 commit