1. 16 May, 2012 8 commits
  2. 15 May, 2012 2 commits
  3. 08 May, 2012 5 commits
    • Ashok Nagarajan's avatar
      {nl,cfg,mac}80211: Allow user to see/configure HT protection mode · 70c33eaa
      Ashok Nagarajan authored
      This patch introduces a new mesh configuration parameter "ht_opmode" and will
      allow user to check the current HT protection mode selected. Users could
      configure the protection mode by the command "iw mesh_iface set mesh_param
      mesh_ht_protection_mode=2". The default protection mode of mesh is set to
      non-HT mixed mode.
      Signed-off-by: default avatarAshok Nagarajan <ashok@cozybit.com>
      Reviewed-by: default avatarThomas Pedersen <thomas@cozybit.com>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
    • Pablo Neira Ayuso's avatar
      netfilter: remove ip_queue support · d16cf20e
      Pablo Neira Ayuso authored
      This patch removes ip_queue support which was marked as obsolete
      years ago. The nfnetlink_queue modules provides more advanced
      user-space packet queueing mechanism.
      This patch also removes capability code included in SELinux that
      refers to ip_queue. Otherwise, we break compilation.
      Several warning has been sent regarding this to the mailing list
      in the past month without anyone rising the hand to stop this
      with some strong argument.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    • Pablo Neira Ayuso's avatar
      netfilter: nf_conntrack: fix explicit helper attachment and NAT · 6714cf54
      Pablo Neira Ayuso authored
      Explicit helper attachment via the CT target is broken with NAT
      if non-standard ports are used. This problem was hidden behind
      the automatic helper assignment routine. Thus, it becomes more
      noticeable now that we can disable the automatic helper assignment
      with Eric Leblond's:
      9e8ac5a netfilter: nf_ct_helper: allow to disable automatic helper assignment
      Basically, nf_conntrack_alter_reply asks for looking up the helper
      up if NAT is enabled. Unfortunately, we don't have the conntrack
      template at that point anymore.
      Since we don't want to rely on the automatic helper assignment,
      we can skip the second look-up and stick to the helper that was
      attached by iptables. With the CT target, the user is in full
      control of helper attachment, thus, the policy is to trust what
      the user explicitly configures via iptables (no automatic magic
      Interestingly, this bug was hidden by the automatic helper look-up
      code. But it can be easily trigger if you attach the helper in
      a non-standard port, eg.
      iptables -I PREROUTING -t raw -p tcp --dport 8888 \
      	-j CT --helper ftp
      And you disabled the automatic helper assignment.
      I added the IPS_HELPER_BIT that allows us to differenciate between
      a helper that has been explicitly attached and those that have been
      automatically assigned. I didn't come up with a better solution
      (having backward compatibility in mind).
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
    • Julian Anastasov's avatar
      ipvs: always update some of the flags bits in backup · cdcc5e90
      Julian Anastasov authored
      	As the goal is to mirror the inactconns/activeconns
      counters in the backup server, make sure the cp->flags are
      updated even if cp is still not bound to dest. If cp->flags
      are not updated ip_vs_bind_dest will rely only on the initial
      flags when updating the counters. To avoid mistakes and
      complicated checks for protocol state rely only on the
      IP_VS_CONN_F_INACTIVE bit when updating the counters.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Tested-by: default avatarAleksey Chudov <aleksey.chudov@gmail.com>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
    • Joe Perches's avatar
      etherdev.h: Convert int is_<foo>_ether_addr to bool · b44907e6
      Joe Perches authored
      Make the return value explicitly true or false.
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  4. 07 May, 2012 3 commits
    • David Daney's avatar
      netdev/of/phy: Add MDIO bus multiplexer support. · 0ca2997d
      David Daney authored
      This patch adds a somewhat generic framework for MDIO bus
      multiplexers.  It is modeled on the I2C multiplexer.
      The multiplexer is needed if there are multiple PHYs with the same
      address connected to the same MDIO bus adepter, or if there is
      insufficient electrical drive capability for all the connected PHY
      Conceptually it could look something like this:
                         | Control Signal |
       ---------------   --------+------
       | MDIO MASTER |---| Multiplexer |
       ---------------   --+-------+----
                           |       |
                           C       C
                           h       h
                           i       i
                           l       l
                           d       d
                           |       |
           ---------       A       B   ---------
           |       |       |       |   |       |
           | PHY@1 +-------+       +---+ PHY@1 |
           |       |       |       |   |       |
           ---------       |       |   ---------
           ---------       |       |   ---------
           |       |       |       |   |       |
           | PHY@2 +-------+       +---+ PHY@2 |
           |       |                   |       |
           ---------                   ---------
      This framework configures the bus topology from device tree data.  The
      mechanics of switching the multiplexer is left to device specific
      The follow-on patch contains a multiplexer driven by GPIO lines.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • David Daney's avatar
      netdev/of/phy: New function: of_mdio_find_bus(). · 25106022
      David Daney authored
      Add of_mdio_find_bus() which allows an mii_bus to be located given its
      associated the device tree node.
      This is needed by the follow-on patch to add a driver for MDIO bus
      The of_mdiobus_register() function is modified so that the device tree
      node is recorded in the mii_bus.  Then we can find it again by
      iterating over all mdio_bus_class devices.
      Because the OF device tree has now become an integral part of the
      kernel, this can live in mdio_bus.c (which contains the needed
      mdio_bus_class structure) instead of of_mdio.c.
      Signed-off-by: default avatarDavid Daney <david.daney@cavium.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Johannes Berg's avatar
      net: compare_ether_addr[_64bits]() has no ordering · 1c430a72
      Johannes Berg authored
      Neither compare_ether_addr() nor compare_ether_addr_64bits()
      (as it can fall back to the former) have comparison semantics
      like memcmp() where the sign of the return value indicates sort
      order. We had a bug in the wireless code due to a blind memcmp
      replacement because of this.
      A cursory look suggests that the wireless bug was the only one
      due to this semantic difference.
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  5. 06 May, 2012 1 commit
  6. 04 May, 2012 3 commits
  7. 03 May, 2012 1 commit
  8. 02 May, 2012 2 commits
    • Yuchung Cheng's avatar
      tcp: early retransmit: delayed fast retransmit · 750ea2ba
      Yuchung Cheng authored
      Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2).
      Delays the fast retransmit by an interval of RTT/4. We borrow the
      RTO timer to implement the delay. If we receive another ACK or send
      a new packet, the timer is cancelled and restored to original RTO
      value offset by time elapsed.  When the delayed-ER timer fires,
      we enter fast recovery and perform fast retransmit.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Yuchung Cheng's avatar
      tcp: early retransmit · eed530b6
      Yuchung Cheng authored
      This patch implements RFC 5827 early retransmit (ER) for TCP.
      It reduces DUPACK threshold (dupthresh) if outstanding packets are
      less than 4 to recover losses by fast recovery instead of timeout.
      While the algorithm is simple, small but frequent network reordering
      makes this feature dangerous: the connection repeatedly enter
      false recovery and degrade performance. Therefore we implement
      a mitigation suggested in the appendix of the RFC that delays
      entering fast recovery by a small interval, i.e., RTT/4. Currently
      ER is conservative and is disabled for the rest of the connection
      after the first reordering event. A large scale web server
      experiment on the performance impact of ER is summarized in
      section 6 of the paper "Proportional Rate Reduction for TCP”,
      IMC 2011. http://conferences.sigcomm.org/imc/2011/docs/p155.pdf
      Note that Linux has a similar feature called THIN_DUPACK. The
      differences are THIN_DUPACK do not mitigate reorderings and is only
      used after slow start. Currently ER is disabled if THIN_DUPACK is
      enabled. I would be happy to merge THIN_DUPACK feature with ER if
      people think it's a good idea.
      ER is enabled by sysctl_tcp_early_retrans:
        0: Disables ER
        1: Reduce dupthresh to packets_out - 1 when outstanding packets < 4.
        2: (Default) reduce dupthresh like mode 1. In addition, delay
           entering fast recovery by RTT/4.
      Note: mode 2 is implemented in the third part of this patch series.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  9. 01 May, 2012 6 commits
  10. 30 Apr, 2012 3 commits
    • Eric Dumazet's avatar
      net: make GRO aware of skb->head_frag · d7e8883c
      Eric Dumazet authored
      GRO can check if skb to be merged has its skb->head mapped to a page
      fragment, instead of a kmalloc() area.
      We 'upgrade' skb->head as a fragment in itself
      This avoids the frag_list fallback, and permits to build true GRO skb
      (one sk_buff and up to 16 fragments), using less memory.
      This reduces number of cache misses when user makes its copy, since a
      single sk_buff is fetched.
      This is a followup of patch "net: allow skb->head to be a page fragment"
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Eric Dumazet's avatar
      net: allow skb->head to be a page fragment · d3836f21
      Eric Dumazet authored
      skb->head is currently allocated from kmalloc(). This is convenient but
      has the drawback the data cannot be converted to a page fragment if
      We have three spots were it hurts :
      1) GRO aggregation
       When a linear skb must be appended to another skb, GRO uses the
      frag_list fallback, very inefficient since we keep all struct sk_buff
      around. So drivers enabling GRO but delivering linear skbs to network
      stack aren't enabling full GRO power.
      2) splice(socket -> pipe).
       We must copy the linear part to a page fragment.
       This kind of defeats splice() purpose (zero copy claim)
      3) TCP coalescing.
       Recently introduced, this permits to group several contiguous segments
      into a single skb. This shortens queue lengths and save kernel memory,
      and greatly reduce probabilities of TCP collapses. This coalescing
      doesnt work on linear skbs (or we would need to copy data, this would be
      too slow)
      Given all these issues, the following patch introduces the possibility
      of having skb->head be a fragment in itself. We use a new skb flag,
      skb->head_frag to carry this information.
      build_skb() is changed to accept a frag_size argument. Drivers willing
      to provide a page fragment instead of kmalloc() data will set a non zero
      value, set to the fragment size.
      Then, on situations we need to convert the skb head to a frag in itself,
      we can check if skb->head_frag is set and avoid the copies or various
      fallbacks we have.
      This means drivers currently using frags could be updated to avoid the
      current skb->head allocation and reduce their memory footprint (aka skb
      truesize). (thats 512 or 1024 bytes saved per skb). This also makes
      bpf/netfilter faster since the 'first frag' will be part of skb linear
      part, no need to copy data.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    • Matthew Garrett's avatar
      efi: Add new variable attributes · 41b3254c
      Matthew Garrett authored
      More recent versions of the UEFI spec have added new attributes for
      variables. Add them.
      Signed-off-by: default avatarMatthew Garrett <mjg@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  11. 29 Apr, 2012 1 commit
    • Linus Torvalds's avatar
      pipes: add a "packetized pipe" mode for writing · 9883035a
      Linus Torvalds authored
      The actual internal pipe implementation is already really about
      individual packets (called "pipe buffers"), and this simply exposes that
      as a special packetized mode.
      When we are in the packetized mode (marked by O_DIRECT as suggested by
      Alan Cox), a write() on a pipe will not merge the new data with previous
      writes, so each write will get a pipe buffer of its own.  The pipe
      buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
      will tell the reader side to break the read at that boundary (and throw
      away any partial packet contents that do not fit in the read buffer).
      End result: as long as you do writes less than PIPE_BUF in size (so that
      the pipe doesn't have to split them up), you can now treat the pipe as a
      packet interface, where each read() system call will read one packet at
      a time.  You can just use a sufficiently big read buffer (PIPE_BUF is
      sufficient, since bigger than that doesn't guarantee atomicity anyway),
      and the return value of the read() will naturally give you the size of
      the packet.
      NOTE! We do not support zero-sized packets, and zero-sized reads and
      writes to a pipe continue to be no-ops.  Also note that big packets will
      currently be split at write time, but that the size at which that
      happens is not really specified (except that it's bigger than PIPE_BUF).
      Currently that limit is the system page size, but we might want to
      explicitly support bigger packets some day.
      The main user for this is going to be the autofs packet interface,
      allowing us to stop having to care so deeply about exact packet sizes
      (which have had bugs with 32/64-bit compatibility modes).  But user
      space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
      fail with an EINVAL on kernels that do not support this interface.
      Tested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: stable@kernel.org  # needed for systemd/autofs interaction fix
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  12. 28 Apr, 2012 1 commit
  13. 27 Apr, 2012 1 commit
  14. 26 Apr, 2012 2 commits
  15. 25 Apr, 2012 1 commit