Skip to content
  • Josh Hunt's avatar
    IPv6: Avoid taking write lock for /proc/net/ipv6_route · 32b293a5
    Josh Hunt authored
    
    
    During some debugging I needed to look into how /proc/net/ipv6_route
    operated and in my digging I found its calling fib6_clean_all() which uses
    "write_lock_bh(&table->tb6_lock)" before doing the walk of the table. I
    found this on 2.6.32, but reading the code I believe the same basic idea
    exists currently. Looking at the rtnetlink code they are only calling
    "read_lock_bh(&table->tb6_lock);" via fib6_dump_table(). While I realize
    reading from proc isn't the recommended way of fetching the ipv6 route
    table; taking a write lock seems unnecessary and would probably cause
    network performance issues.
    
    To verify this I loaded up the ipv6 route table and then ran iperf in 3
    cases:
      * doing nothing
      * reading ipv6 route table via proc
        (while :; do cat /proc/net/ipv6_route > /dev/null; done)
      * reading ipv6 route table via rtnetlink
        (while :; do ip -6 route show table all > /dev/null; done)
    
    * Load the ipv6 route table up with:
      * for ((i = 0;i < 4000;i++)); do ip route add unreachable 2000::$i; done
    
    * iperf commands:
      * client: iperf -i 1 -V -c <ipv6 addr>
      * server: iperf -V -s
    
    * iperf results - 3 runs each (in Mbits/sec)
      * nothing: client: 927,927,927 server: 927,927,927
      * proc: client: 179,97,96,113 server: 142,112,133
      * iproute: client: 928,927,928 server: 927,927,927
    
    lock_stat shows taking the write lock is causing the slowdown. Using this
    info I decided to write a version of fib6_clean_all() which replaces
    write_lock_bh(&table->tb6_lock) with read_lock_bh(&table->tb6_lock). With
    this new function I see the same results as with my rtnetlink iperf test.
    
    Signed-off-by: default avatarJosh Hunt <joshhunt00@gmail.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    32b293a5