1. 15 Dec, 2015 1 commit
  2. 28 Aug, 2015 4 commits
  3. 25 Aug, 2015 1 commit
  4. 31 Mar, 2015 1 commit
  5. 08 Sep, 2014 1 commit
    • Willem de Bruijn's avatar
      inet: remove dead inetpeer sequence code · a7f26b7e
      Willem de Bruijn authored
      inetpeer sequence numbers are no longer incremented, so no need to
      check and flush the tree. The function that increments the sequence
      number was already dead code and removed in in "ipv4: remove unused
      function" (068a6e18). Remove the code that checks for a change, too.
      
      Verifying that v4_seq and v6_seq are never incremented and thus that
      flush_check compares bp->flush_seq to 0 is trivial.
      
      The second part of the change removes flush_check completely even
      though bp->flush_seq is exactly !0 once, at initialization. This
      change is correct because the time this branch is true is when
      bp->root == peer_avl_empty_rcu, in which the branch and
      inetpeer_invalidate_tree are a NOOP.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a7f26b7e
  6. 02 Jun, 2014 2 commits
    • Eric Dumazet's avatar
      net: fix inet_getid() and ipv6_select_ident() bugs · 39c36094
      Eric Dumazet authored
      I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
      is disabled.
      Note how GSO/TSO packets do not have monotonically incrementing ID.
      
      06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
      06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
      06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
      06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
      06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)
      
      It appears I introduced this bug in linux-3.1.
      
      inet_getid() must return the old value of peer->ip_id_count,
      not the new one.
      
      Lets revert this part, and remove the prevention of
      a null identification field in IPv6 Fragment Extension Header,
      which is dubious and not even done properly.
      
      Fixes: 87c48fa3 ("ipv6: make fragment identifications less predictable")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39c36094
    • Eric Dumazet's avatar
      inetpeer: get rid of ip_id_count · 73f156a6
      Eric Dumazet authored
      Ideally, we would need to generate IP ID using a per destination IP
      generator.
      
      linux kernels used inet_peer cache for this purpose, but this had a huge
      cost on servers disabling MTU discovery.
      
      1) each inet_peer struct consumes 192 bytes
      
      2) inetpeer cache uses a binary tree of inet_peer structs,
         with a nominal size of ~66000 elements under load.
      
      3) lookups in this tree are hitting a lot of cache lines, as tree depth
         is about 20.
      
      4) If server deals with many tcp flows, we have a high probability of
         not finding the inet_peer, allocating a fresh one, inserting it in
         the tree with same initial ip_id_count, (cf secure_ip_id())
      
      5) We garbage collect inet_peer aggressively.
      
      IP ID generation do not have to be 'perfect'
      
      Goal is trying to avoid duplicates in a short period of time,
      so that reassembly units have a chance to complete reassembly of
      fragments belonging to one message before receiving other fragments
      with a recycled ID.
      
      We simply use an array of generators, and a Jenkin hash using the dst IP
      as a key.
      
      ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
      belongs (it is only used from this file)
      
      secure_ip_id() and secure_ipv6_id() no longer are needed.
      
      Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
      unnecessary decrement/increment of the number of segments.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      73f156a6
  7. 28 Dec, 2013 1 commit
  8. 21 Sep, 2013 1 commit
  9. 10 Jul, 2012 2 commits
  10. 11 Jun, 2012 3 commits
  11. 09 Jun, 2012 3 commits
  12. 08 Jun, 2012 2 commits
  13. 06 Jun, 2012 1 commit
  14. 08 Mar, 2012 2 commits
  15. 26 Nov, 2011 1 commit
  16. 22 Nov, 2011 1 commit
  17. 26 Jul, 2011 1 commit
  18. 21 Jul, 2011 1 commit
  19. 09 Jun, 2011 1 commit
    • Eric Dumazet's avatar
      inetpeer: lower false sharing effect · 2b77bdde
      Eric Dumazet authored
      Profiles show false sharing in addr_compare() because refcnt/dtime
      changes dirty the first inet_peer cache line, where are lying the keys
      used at lookup time. If many cpus are calling inet_getpeer() and
      inet_putpeer(), or need frag ids, addr_compare() is in 2nd position in
      "perf top".
      
      Before patch, my udpflood bench (16 threads) on my 2x4x2 machine :
      
                   5784.00  9.7% csum_partial_copy_generic [kernel]
                   3356.00  5.6% addr_compare              [kernel]
                   2638.00  4.4% fib_table_lookup          [kernel]
                   2625.00  4.4% ip_fragment               [kernel]
                   1934.00  3.2% neigh_lookup              [kernel]
                   1617.00  2.7% udp_sendmsg               [kernel]
                   1608.00  2.7% __ip_route_output_key     [kernel]
                   1480.00  2.5% __ip_append_data          [kernel]
                   1396.00  2.3% kfree                     [kernel]
                   1195.00  2.0% kmem_cache_free           [kernel]
                   1157.00  1.9% inet_getpeer              [kernel]
                   1121.00  1.9% neigh_resolve_output      [kernel]
                   1012.00  1.7% dev_queue_xmit            [kernel]
      # time ./udpflood.sh
      
      real	0m44.511s
      user	0m20.020s
      sys	11m22.780s
      
      # time ./udpflood.sh
      
      real	0m44.099s
      user	0m20.140s
      sys	11m15.870s
      
      After patch, no more addr_compare() in profiles :
      
                   4171.00 10.7% csum_partial_copy_generic   [kernel]
                   1787.00  4.6% fib_table_lookup            [kernel]
                   1756.00  4.5% ip_fragment                 [kernel]
                   1234.00  3.2% udp_sendmsg                 [kernel]
                   1191.00  3.0% neigh_lookup                [kernel]
                   1118.00  2.9% __ip_append_data            [kernel]
                   1022.00  2.6% kfree                       [kernel]
                    993.00  2.5% __ip_route_output_key       [kernel]
                    841.00  2.2% neigh_resolve_output        [kernel]
                    816.00  2.1% kmem_cache_free             [kernel]
                    658.00  1.7% ia32_sysenter_target        [kernel]
                    632.00  1.6% kmem_cache_alloc_node       [kernel]
      
      # time ./udpflood.sh
      
      real	0m41.587s
      user	0m19.190s
      sys	10m36.370s
      
      # time ./udpflood.sh
      
      real	0m41.486s
      user	0m19.290s
      sys	10m33.650s
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2b77bdde
  20. 08 Jun, 2011 1 commit
    • Eric Dumazet's avatar
      inetpeer: remove unused list · 4b9d9be8
      Eric Dumazet authored
      Andi Kleen and Tim Chen reported huge contention on inetpeer
      unused_peers.lock, on memcached workload on a 40 core machine, with
      disabled route cache.
      
      It appears we constantly flip peers refcnt between 0 and 1 values, and
      we must insert/remove peers from unused_peers.list, holding a contended
      spinlock.
      
      Remove this list completely and perform a garbage collection on-the-fly,
      at lookup time, using the expired nodes we met during the tree
      traversal.
      
      This removes a lot of code, makes locking more standard, and obsoletes
      two sysctls (inet_peer_gc_mintime and inet_peer_gc_maxtime). This also
      removes two pointers in inet_peer structure.
      
      There is still a false sharing effect because refcnt is in first cache
      line of object [were the links and keys used by lookups are located], we
      might move it at the end of inet_peer structure to let this first cache
      line mostly read by cpus.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      CC: Andi Kleen <andi@firstfloor.org>
      CC: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4b9d9be8
  21. 22 Apr, 2011 1 commit
  22. 10 Feb, 2011 2 commits
  23. 04 Feb, 2011 1 commit
  24. 27 Jan, 2011 2 commits
  25. 01 Dec, 2010 2 commits
  26. 30 Nov, 2010 1 commit