1. 15 Nov, 2015 1 commit
  2. 17 Sep, 2015 1 commit
  3. 20 Aug, 2015 1 commit
  4. 21 Jul, 2015 1 commit
  5. 25 May, 2015 4 commits
  6. 01 May, 2015 2 commits
    • Martin KaFai Lau's avatar
      ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer · afc4eef8
      Martin KaFai Lau authored
      _rt6i_peer is no longer needed after the last patch,
      'ipv6: Stop rt6_info from using inet_peer's metrics'.
      
      DST_METRICS_FORCE_OVERWRITE is added by
      commit e5fd387a ("ipv6: do not overwrite inetpeer metrics prematurely").
      Since inetpeer is no longer used for metrics, this bit is also not needed.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Reviewed-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Michal Kubeček <mkubecek@suse.cz>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      afc4eef8
    • Martin KaFai Lau's avatar
      ipv6: Stop rt6_info from using inet_peer's metrics · 4b32b5ad
      Martin KaFai Lau authored
      inet_peer is indexed by the dst address alone.  However, the fib6 tree
      could have multiple routing entries (rt6_info) for the same dst. For
      example,
      1. A /128 dst via multiple gateways.
      2. A RTF_CACHE route cloned from a /128 route.
      
      In the above cases, all of them will share the same metrics and
      step on each other.
      
      This patch will steer away from inet_peer's metrics and use
      dst_cow_metrics_generic() for everything.
      
      Change Highlights:
      1. Remove rt6_cow_metrics() which currently acquires metrics from
         inet_peer for DST_HOST route (i.e. /128 route).
      2. Add rt6i_pmtu to take care of the pmtu update to avoid creating a
         full size metrics just to override the RTAX_MTU.
      3. After (2), the RTF_CACHE route can also share the metrics with its
         dst.from route, by:
         dst_init_metrics(&cache_rt->dst, dst_metrics_ptr(cache_rt->dst.from), true);
      4. Stop creating RTF_CACHE route by cloning another RTF_CACHE route.  Instead,
         directly clone from rt->dst.
      
         [ Currently, cloning from another RTF_CACHE is only possible during
           rt6_do_redirect().  Also, the old clone is removed from the tree
           immediately after the new clone is added. ]
      
         In case of cloning from an older redirect RTF_CACHE, it should work as
         before.
      
         In case of cloning from an older pmtu RTF_CACHE, this patch will forget
         the pmtu and re-learn it (if there is any) from the redirected route.
      
      The _rt6i_peer and DST_METRICS_FORCE_OVERWRITE will be removed
      in the next cleanup patch.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Reviewed-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4b32b5ad
  7. 05 Jan, 2015 1 commit
  8. 06 Oct, 2014 2 commits
  9. 30 Sep, 2014 1 commit
    • Hannes Frederic Sowa's avatar
      ipv6: remove rt6i_genid · 705f1c86
      Hannes Frederic Sowa authored
      Eric Dumazet noticed that all no-nonexthop or no-gateway routes which
      are already marked DST_HOST (e.g. input routes routes) will always be
      invalidated during sk_dst_check. Thus per-socket dst caching absolutely
      had no effect and early demuxing had no effect.
      
      Thus this patch removes rt6i_genid: fn_sernum already gets modified during
      add operations, so we only must ensure we mutate fn_sernum during ipv6
      address remove operations. This is a fairly cost extensive operations,
      but address removal should not happen that often. Also our mtu update
      functions do the same and we heard no complains so far. xfrm policy
      changes also cause a call into fib6_flush_trees. Also plug a hole in
      rt6_info (no cacheline changes).
      
      I verified via tracing that this change has effect.
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Cc: Martin Lau <kafai@fb.com>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      705f1c86
  10. 27 Mar, 2014 1 commit
    • Michal Kubeček's avatar
      ipv6: do not overwrite inetpeer metrics prematurely · e5fd387a
      Michal Kubeček authored
      If an IPv6 host route with metrics exists, an attempt to add a
      new route for the same target with different metrics fails but
      rewrites the metrics anyway:
      
      12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
      12sp0:~ # ip -6 route show
      fe80::/64 dev eth0  proto kernel  metric 256
      fec0::1 dev eth0  metric 1024  rto_min lock 1s
      12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500
      RTNETLINK answers: File exists
      12sp0:~ # ip -6 route show
      fe80::/64 dev eth0  proto kernel  metric 256
      fec0::1 dev eth0  metric 1024  rto_min lock 1.5s
      
      This is caused by all IPv6 host routes using the metrics in
      their inetpeer (or the shared default). This also holds for the
      new route created in ip6_route_add() which shares the metrics
      with the already existing route and thus ip6_route_add()
      rewrites the metrics even if the new route ends up not being
      used at all.
      
      Another problem is that old metrics in inetpeer can reappear
      unexpectedly for a new route, e.g.
      
      12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
      12sp0:~ # ip route del fec0::1
      12sp0:~ # ip route add fec0::1 dev eth0
      12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10
      12sp0:~ # ip -6 route show
      fe80::/64 dev eth0  proto kernel  metric 256
      fec0::1 dev eth0  metric 1024  hoplimit 10 rto_min lock 1s
      
      Resolve the first problem by moving the setting of metrics down
      into fib6_add_rt2node() to the point we are sure we are
      inserting the new route into the tree. Second problem is
      addressed by introducing new flag DST_METRICS_FORCE_OVERWRITE
      which is set for a new host route in ip6_route_add() and makes
      ipv6_cow_metrics() always overwrite the metrics in inetpeer
      (even if they are not "new"); it is reset after that.
      
      v5: use a flag in _metrics member rather than one in flags
      
      v4: fix a typo making a condition always true (thanks to Hannes
      Frederic Sowa)
      
      v3: rewritten based on David Miller's idea to move setting the
      metrics (and allocation in non-host case) down to the point we
      already know the route is to be inserted. Also rebased to
      net-next as it is quite late in the cycle.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5fd387a
  11. 02 Jan, 2014 1 commit
  12. 25 Oct, 2013 1 commit
    • Hannes Frederic Sowa's avatar
      ipv6: reset dst.expires value when clearing expire flag · 01ba16d6
      Hannes Frederic Sowa authored
      On receiving a packet too big icmp error we update the expire value by
      calling rt6_update_expires. This function uses dst_set_expires which is
      implemented that it can only reduce the expiration value of the dst entry.
      
      If we insert new routing non-expiry information into the ipv6 fib where
      we already have a matching rt6_info we only clear the RTF_EXPIRES flag
      in rt6i_flags and leave the dst.expires value as is.
      
      When new mtu information arrives for that cached dst_entry we again
      call dst_set_expires. This time it won't update the dst.expire value
      because we left the dst.expire value intact from the last update. So
      dst_set_expires won't touch dst.expires.
      
      Fix this by resetting dst.expires when clearing the RTF_EXPIRE flag.
      dst_set_expires checks for a zero expiration and updates the
      dst.expires.
      
      In the past this (not updating dst.expires) was necessary because
      dst.expire was placed in a union with the dst_entry *from reference
      and rt6_clean_expires did assign NULL to it. This split happend in
      ecd98837 ("ipv6: fix race condition
      regarding dst->expires and dst->from").
      Reported-by: default avatarSteinar H. Gunderson <sgunderson@bigfoot.com>
      Reported-by: default avatarValentijn Sessink <valentyn@blub.net>
      Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarValentijn Sessink <valentyn@blub.net>
      Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      01ba16d6
  13. 27 Sep, 2013 1 commit
  14. 21 Sep, 2013 1 commit
  15. 01 Aug, 2013 1 commit
    • Michal Kubeček's avatar
      ipv6: prevent fib6_run_gc() contention · 2ac3ac8f
      Michal Kubeček authored
      On a high-traffic router with many processors and many IPv6 dst
      entries, soft lockup in fib6_run_gc() can occur when number of
      entries reaches gc_thresh.
      
      This happens because fib6_run_gc() uses fib6_gc_lock to allow
      only one thread to run the garbage collector but ip6_dst_gc()
      doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
      returns. On a system with many entries, this can take some time
      so that in the meantime, other threads pass the tests in
      ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
      the lock. They then have to run the garbage collector one after
      another which blocks them for quite long.
      
      Resolve this by replacing special value ~0UL of expire parameter
      to fib6_run_gc() by explicit "force" parameter to choose between
      spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
      force=false if gc_thresh is reached but not max_size.
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ac3ac8f
  16. 20 Feb, 2013 1 commit
  17. 17 Jan, 2013 1 commit
  18. 08 Nov, 2012 1 commit
  19. 03 Nov, 2012 1 commit
  20. 23 Oct, 2012 1 commit
    • Nicolas Dichtel's avatar
      ipv6: add support of equal cost multipath (ECMP) · 51ebd318
      Nicolas Dichtel authored
      Each nexthop is added like a single route in the routing table. All routes
      that have the same metric/weight and destination but not the same gateway
      are considering as ECMP routes. They are linked together, through a list called
      rt6i_siblings.
      
      ECMP routes can be added in one shot, with RTA_MULTIPATH attribute or one after
      the other (in both case, the flag NLM_F_EXCL should not be set).
      
      The patch is based on a previous work from
      Luc Saillard <luc.saillard@6wind.com>.
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      51ebd318
  21. 18 Sep, 2012 1 commit
  22. 05 Sep, 2012 1 commit
  23. 05 Jul, 2012 1 commit
  24. 16 Jun, 2012 1 commit
  25. 15 Jun, 2012 1 commit
  26. 11 Jun, 2012 2 commits
  27. 17 Apr, 2012 2 commits
    • Jiri Bohac's avatar
      ipv6: clean up rt6_clean_expires · cda31e10
      Jiri Bohac authored
      Functionally, this change is a NOP.
      
      Semantically, rt6_clean_expires() wants to do rt->dst.from = NULL instead of
      rt->dst.expires = 0. It is clearing the RTF_EXPIRES flag, so the union is going
      to be treated as a pointer (dst.from) not a long (dst.expires).
      Signed-off-by: default avatarJiri Bohac <jbohac@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cda31e10
    • Jiri Bohac's avatar
      ipv6: fix rt6_update_expires · edfb5d46
      Jiri Bohac authored
      Commit 1716a961 (ipv6: fix problem with expired dst cache) broke PMTU
      discovery. rt6_update_expires() calls dst_set_expires(), which only updates
      dst->expires if it has not been set previously (expires == 0) or if the new
      expires is earlier than the current dst->expires.
      
      rt6_update_expires() needs to zero rt->dst.expires, otherwise it will contain
      ivalid data left over from rt->dst.from and will confuse dst_set_expires().
      Signed-off-by: default avatarJiri Bohac <jbohac@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      edfb5d46
  28. 13 Apr, 2012 1 commit
    • Gao feng's avatar
      ipv6: fix problem with expired dst cache · 1716a961
      Gao feng authored
      If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
      this dst cache will not check expire because it has no RTF_EXPIRES flag.
      So this dst cache will always be used until the dst gc run.
      
      Change the struct dst_entry,add a union contains new pointer from and expires.
      When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
      we can use this field to point to where the dst cache copy from.
      The dst.from is only used in IPV6.
      
      rt6_check_expired check if rt6_info.dst.from is expired.
      
      ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
      and RTF_DEFAULT.then hold the ort.
      
      ip6_dst_destroy release the ort.
      
      Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
      and change the code to use these new adding functions.
      
      Changes from v5:
      modify ip6_route_add and ndisc_router_discovery to use new adding functions.
      
      Only set dst.from when the ort has flag RTF_ADDRCONF
      and RTF_DEFAULT.then hold the ort.
      Signed-off-by: default avatarGao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1716a961
  29. 30 Dec, 2011 1 commit
    • Josh Hunt's avatar
      IPv6: Avoid taking write lock for /proc/net/ipv6_route · 32b293a5
      Josh Hunt authored
      During some debugging I needed to look into how /proc/net/ipv6_route
      operated and in my digging I found its calling fib6_clean_all() which uses
      "write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
      found this on 2.6.32, but reading the code I believe the same basic idea
      exists currently. Looking at the rtnetlink code they are only calling
      "read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
      reading from proc isn't the recommended way of fetching the ipv6 route
      table; taking a write lock seems unnecessary and would probably cause
      network performance issues.
      
      To verify this I loaded up the ipv6 route table and then ran iperf in 3
      cases:
        * doing nothing
        * reading ipv6 route table via proc
          (while :; do cat /proc/net/ipv6_route > /dev/null; done)
        * reading ipv6 route table via rtnetlink
          (while :; do ip -6 route show table all > /dev/null; done)
      
      * Load the ipv6 route table up with:
        * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done
      
      * iperf commands:
        * client: iperf -i 1 -V -c <ipv6 addr>
        * server: iperf -V -s
      
      * iperf results - 3 runs each (in Mbits/sec)
        * nothing: client: 927,927,927 server: 927,927,927
        * proc: client: 179,97,96,113 server: 142,112,133
        * iproute: client: 928,927,928 server: 927,927,927
      
      lock_stat shows taking the write lock is causing the slowdown. Using this
      info I decided to write a version of fib6_clean_all() which replaces
      write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
      this new function I see the same results as with my rtnetlink iperf test.
      Signed-off-by: default avatarJosh Hunt <joshhunt00@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      32b293a5
  30. 28 Dec, 2011 1 commit
  31. 18 Jul, 2011 1 commit
  32. 24 Apr, 2011 1 commit
  33. 22 Apr, 2011 1 commit