  1. 25 Jun, 2016 2 commits
  2. 15 Jun, 2016 1 commit
  3. 09 Jun, 2016 1 commit
  4. 07 Jun, 2016 3 commits
  5. 16 May, 2016 1 commit
  6. 08 May, 2016 1 commit
    • fq_codel: add memory limitation per queue · 95b58430
      Eric Dumazet authored
      On small embedded routers, one wants to control the maximum amount of
      memory used by fq_codel, rather than the number of packets or bytes,
      since GRO/TSO make those limits impractical.
      
      Assuming skb->truesize is accurate, we have to keep track of
      skb->truesize sum for skbs in queue.
      
      This patch adds a new TCA_FQ_CODEL_MEMORY_LIMIT attribute.
      
      I chose a default value of 32 MBytes, which looks reasonable even
      for heavy duty usages. (Prior fq_codel users should not be hurt
      when they upgrade their kernels)
      
      Two fields are added to tc_fq_codel_qd_stats to report :
       - Current memory usage
       - Number of drops caused by memory limits
      
      # tc qd replace dev eth1 root est 1sec 4sec fq_codel memory_limit 4M
      ..
      # tc -s -d qd sh dev eth1
      qdisc fq_codel 8008: root refcnt 257 limit 10240p flows 1024
       quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
       Sent 2083566791363 bytes 1376214889 pkt (dropped 4994406, overlimits 0 requeues 21705223)
       rate 9841Mbit 812549pps backlog 3906120b 376p requeues 21705223
        maxpacket 68130 drop_overlimit 4994406 new_flow_count 28855414
        ecn_mark 0 memory_used 4190048 drop_overmemory 4994406
        new_flows_len 1 old_flows_len 177
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Dave Täht <dave.taht@gmail.com>
      Cc: Sebastian Möller <moeller0@gmx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
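      The accounting described in this commit message can be sketched in a few
      lines of Python. This is only an illustration, not the kernel code:
      `Packet`, `MemoryLimitedQueue` and the single FIFO are hypothetical
      stand-ins (fq_codel keeps one queue per flow), and `truesize` is a plain
      integer standing in for skb->truesize.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    length: int    # payload bytes
    truesize: int  # memory held by the buffer (stand-in for skb->truesize)


@dataclass
class MemoryLimitedQueue:
    packet_limit: int = 10240             # "limit 10240p" in the tc output
    memory_limit: int = 32 * 1024 * 1024  # 32 MBytes default from the commit message
    memory_used: int = 0                  # reported as memory_used
    drop_overlimit: int = 0
    drop_overmemory: int = 0              # drops caused by the memory limit
    packets: deque = field(default_factory=deque)

    def enqueue(self, pkt: Packet) -> int:
        """Queue pkt, then drop from the head until both limits hold again.

        Returns the number of packets dropped by this call.
        """
        self.packets.append(pkt)
        self.memory_used += pkt.truesize
        dropped = 0
        while self.packets:
            over_memory = self.memory_used > self.memory_limit
            over_packets = len(self.packets) > self.packet_limit
            if not (over_memory or over_packets):
                break
            victim = self.packets.popleft()
            self.memory_used -= victim.truesize
            self.drop_overlimit += 1
            if over_memory:
                self.drop_overmemory += 1
            dropped += 1
        return dropped
```

      With a 4096-byte limit and 2048-byte buffers, the third enqueue pushes
      usage to 6144 bytes, so one packet is dropped from the head and
      drop_overmemory is incremented.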
  7. 03 May, 2016 1 commit
    • fq_codel: add batch ability to fq_codel_drop() · 9d18562a
      Eric Dumazet authored
      In the presence of inelastic flows and stress, we can call
      fq_codel_drop() for every packet entering the fq_codel qdisc.
      
      fq_codel_drop() is quite expensive, as it does a linear scan
      of 4 KB of memory to find a fat flow.
      Once found, it drops the oldest packet of this flow.
      
      Instead of dropping a single packet, try to drop 50% of the backlog
      of this fat flow, up to a configurable limit (64 packets per round by default).
      
      TCA_FQ_CODEL_DROP_BATCH_SIZE is the new attribute to make this
      limit configurable.
      
      With this strategy the 4 KB search is amortized to a single cache line
      per drop [1], so fq_codel_drop() no longer appears at the top of the kernel
      profile in the presence of a few inelastic flows.
      
      [1] Assuming a 64-byte cache line, and 1024 buckets
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Dave Taht <dave.taht@gmail.com>
      Cc: Jonathan Morton <chromatix99@gmail.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Dave Taht
      Signed-off-by: David S. Miller <davem@davemloft.net>
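      A minimal Python sketch of the batch drop described in this commit
      message follows. It is an illustration under assumptions, not the kernel
      implementation: `drop_batch` is a hypothetical name, flows are modelled
      as deques of packet lengths, and the backlog of each flow is recomputed
      here instead of being kept in an array.

```python
from collections import deque


def drop_batch(flows, max_packets=64):
    """Drop from the head of the fattest flow.

    flows is a list of deques holding packet lengths in bytes.
    Returns (index of the fat flow, number of packets dropped).
    """
    # Linear scan for the flow with the largest byte backlog.
    backlogs = [sum(flow) for flow in flows]
    fat = max(range(len(flows)), key=backlogs.__getitem__)
    threshold = backlogs[fat] // 2  # aim for 50% of the backlog

    dropped = 0
    dropped_bytes = 0
    while flows[fat]:
        dropped_bytes += flows[fat].popleft()  # oldest packet first
        dropped += 1
        if dropped >= max_packets or dropped_bytes >= threshold:
            break
    return fat, dropped


# Footnote [1] as arithmetic: a 4 KB scan spread over a 64-packet batch
# costs 64 bytes per drop, i.e. one 64-byte cache line.
SCAN_BYTES = 4 * 1024
BYTES_PER_DROP = SCAN_BYTES // 64
```

      For a flow holding 200 packets of 100 bytes, the default batch size stops
      the round after 64 packets; with a larger batch size the round stops once
      half of the backlog (100 packets) has been dropped.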
  8. 25 Apr, 2016 2 commits
  9. 29 Feb, 2016 1 commit
  10. 27 Aug, 2015 1 commit
    • net: sched: consolidate tc_classify{,_compat} · 3b3ae880
      Daniel Borkmann authored
      For classifiers getting invoked via tc_classify(), we always need an
      extra function call into tc_classify_compat(), as both are being
      exported as symbols and tc_classify() itself doesn't do much except
      handling of reclassifications when tp->classify() returned with
      TC_ACT_RECLASSIFY.
      
      CBQ and ATM are the only qdiscs that directly call into tc_classify_compat(),
      all others use tc_classify(). When tc actions are being configured
      out in the kernel, tc_classify() effectively does nothing besides
      delegating.
      
      We could spare this layer and consolidate both functions. With pktgen on a
      single CPU constantly pushing skbs directly into the netif_receive_skb()
      path, and a dummy classifier attached to the ingress qdisc, throughput
      improves slightly from 22.3Mpps to 23.1Mpps.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
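      The idea of one function that both walks the classifier chain and handles
      reclassification can be sketched as below. This is a hedged illustration
      only: `classify`, the callable classifiers and the `MAX_RECLASSIFY` bound
      are hypothetical, and the verdict constants are named after the kernel's
      TC_ACT_* values purely for readability.

```python
TC_ACT_UNSPEC = -1   # no verdict, try the next classifier
TC_ACT_OK = 0
TC_ACT_RECLASSIFY = 1
TC_ACT_SHOT = 2
MAX_RECLASSIFY = 4   # hypothetical loop bound for this sketch


def classify(packet, classifiers, max_reclassify=MAX_RECLASSIFY):
    """Walk the classifier chain, restarting from the top on RECLASSIFY."""
    restarts = 0
    while True:
        for clf in classifiers:
            verdict = clf(packet)
            if verdict == TC_ACT_RECLASSIFY:
                restarts += 1
                break  # restart from the first classifier
            if verdict >= 0:
                return verdict
        else:
            return TC_ACT_UNSPEC  # chain exhausted without a match
        if restarts > max_reclassify:
            return TC_ACT_SHOT  # give up rather than loop forever
```

      Folding the restart logic into the same loop is what removes the extra
      function call per packet that the commit message refers to.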
  11. 02 Aug, 2015 1 commit
    • fq_codel: explicitly reset flows in ->reset() · 3d0e0af4
      Eric Dumazet authored
      Alex reported the following crash when using fq_codel
      with htb:
      
        crash> bt
        PID: 630839  TASK: ffff8823c990d280  CPU: 14  COMMAND: "tc"
         [... snip ...]
         #8 [ffff8820ceec17a0] page_fault at ffffffff8160a8c2
            [exception RIP: htb_qlen_notify+24]
            RIP: ffffffffa0841718  RSP: ffff8820ceec1858  RFLAGS: 00010282
            RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffff88241747b400
            RDX: ffff88241747b408  RSI: 0000000000000000  RDI: ffff8811fb27d000
            RBP: ffff8820ceec1868   R8: ffff88120cdeff24   R9: ffff88120cdeff30
            R10: 0000000000000bd4  R11: ffffffffa0840919  R12: ffffffffa0843340
            R13: 0000000000000000  R14: 0000000000000001  R15: ffff8808dae5c2e8
            ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
         #9 [...] qdisc_tree_decrease_qlen at ffffffff81565375
        #10 [...] fq_codel_dequeue at ffffffffa084e0a0 [sch_fq_codel]
        #11 [...] fq_codel_reset at ffffffffa084e2f8 [sch_fq_codel]
        #12 [...] qdisc_destroy at ffffffff81560d2d
        #13 [...] htb_destroy_class at ffffffffa08408f8 [sch_htb]
        #14 [...] htb_put at ffffffffa084095c [sch_htb]
        #15 [...] tc_ctl_tclass at ffffffff815645a3
        #16 [...] rtnetlink_rcv_msg at ffffffff81552cb0
        [... snip ...]
      
      As Jamal pointed out, there is actually no need to call dequeue
      to purge the queued skbs in reset; the data structures can just be
      reset explicitly. Therefore, we reset everything except configuration
      and stats, so that we have a fresh start after device flipping.
      
      Fixes: 4b549a2e ("fq_codel: Fair Queue Codel AQM")
      Reported-by: Alex Gartrell <agartrell@fb.com>
      Cc: Alex Gartrell <agartrell@fb.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      [xiyou.wangcong@gmail.com: added codel_vars_init() and qdisc_qstats_backlog_dec()]
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
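      The explicit reset described in this commit message can be illustrated
      with a small Python model. Everything here is hypothetical (`Flow`,
      `FqCodelSketch` and their fields are stand-ins for the qdisc's private
      data); the point is only that reset() clears state in place instead of
      draining packets through the dequeue path.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Flow:
    packets: deque = field(default_factory=deque)
    deficit: int = 0
    dropping: bool = False  # stand-in for the per-flow CoDel variables


@dataclass
class FqCodelSketch:
    flows_cnt: int = 1024    # configuration: kept across reset
    limit: int = 10240       # configuration: kept across reset
    drop_overlimit: int = 0  # statistics: kept across reset
    qlen: int = 0
    backlog: int = 0
    flows: list = field(default_factory=list)
    new_flows: deque = field(default_factory=deque)
    old_flows: deque = field(default_factory=deque)

    def __post_init__(self):
        self.flows = [Flow() for _ in range(self.flows_cnt)]

    def reset(self):
        """Clear queue state directly, without going through dequeue()."""
        self.new_flows.clear()
        self.old_flows.clear()
        for flow in self.flows:
            flow.packets.clear()   # purge queued packets in place
            flow.deficit = 0
            flow.dropping = False  # fresh CoDel state
        self.qlen = 0
        self.backlog = 0
```

      Because reset() never calls a dequeue routine, nothing in this path
      reports queue length changes to a parent, which is the call chain visible
      in the backtrace.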
  12. 15 Jul, 2015 2 commits
  13. 10 May, 2015 1 commit
    • codel: add ce_threshold attribute · 80ba92fa
      Eric Dumazet authored
      For DCTCP or similar ECN based deployments on fabrics with shallow
      buffers, hosts are responsible for a good part of the buffering.
      
      This patch adds an optional ce_threshold to codel & fq_codel qdiscs,
      so that DCTCP can have feedback from queuing in the host.
      
      A DCTCP enabled egress port simply has a queue occupancy threshold
      above which ECT packets get a CE mark.
      
      In codel language this translates to a sojourn time, so that one doesn't
      have to worry about bytes or bandwidth but delays.
      
      This makes the host an active participant in the health of the whole
      network.
      
      This also helps when experimenting with DCTCP in a setup without a
      DCTCP-compliant fabric.
      
      In the following example, ce_threshold is set to 1ms, and we can see from
      'ldelay xxx us' that TCP is not trying to go around the 5ms codel
      target.
      
      Queue has more capacity to absorb inelastic bursts (say from UDP
      traffic), as queues are maintained to an optimal level.
      
      lpaa23:~# ./tc -s -d qd sh dev eth1
      qdisc mq 1: dev eth1 root
       Sent 87910654696 bytes 58065331 pkt (dropped 0, overlimits 0 requeues 42961)
       backlog 3108242b 364p requeues 42961
      qdisc codel 8063: dev eth1 parent 1:1 limit 1000p target 5.0ms ce_threshold 1.0ms interval 100.0ms
       Sent 7363778701 bytes 4863809 pkt (dropped 0, overlimits 0 requeues 5503)
       rate 2348Mbit 193919pps backlog 255866b 46p requeues 5503
        count 0 lastcount 0 ldelay 1.0ms drop_next 0us
        maxpacket 68130 ecn_mark 0 drop_overlimit 0 ce_mark 72384
      qdisc codel 8064: dev eth1 parent 1:2 limit 1000p target 5.0ms ce_threshold 1.0ms interval 100.0ms
       Sent 7636486190 bytes 5043942 pkt (dropped 0, overlimits 0 requeues 5186)
       rate 2319Mbit 191538pps backlog 207418b 64p requeues 5186
        count 0 lastcount 0 ldelay 694us drop_next 0us
        maxpacket 68130 ecn_mark 0 drop_overlimit 0 ce_mark 69873
      qdisc codel 8065: dev eth1 parent 1:3 limit 1000p target 5.0ms ce_threshold 1.0ms interval 100.0ms
       Sent 11569360142 bytes 7641602 pkt (dropped 0, overlimits 0 requeues 5554)
       rate 3041Mbit 251096pps backlog 210446b 59p requeues 5554
        count 0 lastcount 0 ldelay 889us drop_next 0us
        maxpacket 68130 ecn_mark 0 drop_overlimit 0 ce_mark 37780
      ...
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Glenn Judd <glenn.judd@morganstanley.com>
      Cc: Nandita Dukkipati <nanditad@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
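      The marking rule described in this commit message (a sojourn time
      threshold above which ECT packets get a CE mark) can be sketched as
      follows. The function name and its arguments are hypothetical, times are
      in microseconds, and the strict "greater than" comparison is an
      assumption of this sketch.

```python
def count_ce_marks(sojourn_times_us, ce_threshold_us, ect_capable=True):
    """Count packets that would be CE marked at dequeue time.

    A packet is marked when it is ECN capable (ECT) and its sojourn time
    in the queue exceeds ce_threshold. Nothing is dropped by this rule.
    """
    if ce_threshold_us is None or not ect_capable:
        return 0
    return sum(1 for sojourn in sojourn_times_us if sojourn > ce_threshold_us)
```

      With a 1ms threshold, sojourn times of 694us and 889us pass unmarked,
      while 1200us and 2500us would each count toward ce_mark.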
  14. 03 May, 2015 2 commits
    • sched: Call skb_get_hash_perturb in sch_fq_codel · 342db221
      Tom Herbert authored
      Call skb_get_hash_perturb instead of doing skb_flow_dissect and then
      jhash by hand.
      Signed-off-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • codel: fix maxpacket/mtu confusion · a5d28090
      Eric Dumazet authored
      In the presence of TSO/GSO/GRO packets, codel at low rates can be quite
      useless. In the following example, not a single packet was ever dropped,
      while the average delay in the codel queue is ~100 ms!
      
      qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
       Sent 134376498 bytes 88797 pkt (dropped 0, overlimits 0 requeues 0)
       backlog 13626b 3p requeues 0
        count 0 lastcount 0 ldelay 96.9ms drop_next 0us
        maxpacket 9084 ecn_mark 0 drop_overlimit 0
      
      This comes from confusion about what the minimal backlog should be. It is
      pretty clear it is not 64KB or whatever max GSO packet ever reached the
      qdisc.
      
      The intent of codel was to use the MTU of the device.
      
      After the fix, we finally drop some packets, and rtt/cwnd of my single
      TCP flow are meeting our expectations.
      
      qdisc codel 0: parent 1:12 limit 16000p target 5.0ms interval 100.0ms
       Sent 102798497 bytes 67912 pkt (dropped 1365, overlimits 0 requeues 0)
       backlog 6056b 3p requeues 0
        count 1 lastcount 1 ldelay 36.3ms drop_next 0us
        maxpacket 10598 ecn_mark 0 drop_overlimit 0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Van Jacobson <vanj@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
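      The effect of the minimal backlog choice can be shown with a small
      sketch. `exempt_from_drop` is a hypothetical helper that models only the
      "queue is good" part of CoDel's drop decision, and the numbers combine
      values from the two tc outputs above for illustration: a 6056 byte
      backlog, a 9084 byte maxpacket, and an assumed 1514 byte MTU.

```python
def exempt_from_drop(sojourn_ms, backlog_bytes, target_ms, min_backlog_bytes):
    """True when the 'queue is good' test lets the packet through.

    The queue counts as good if the packet waited less than the target
    or the backlog does not exceed the minimal backlog threshold.
    """
    return sojourn_ms < target_ms or backlog_bytes <= min_backlog_bytes


MAXPACKET = 9084   # largest packet seen by the qdisc (includes GSO packets)
DEVICE_MTU = 1514  # assumed device MTU for this illustration

# Threshold tied to maxpacket: a 96.9 ms delay is still treated as fine.
before_fix = exempt_from_drop(96.9, 6056, 5.0, MAXPACKET)
# Threshold tied to the MTU: the same queue is now eligible for drops.
after_fix = exempt_from_drop(96.9, 6056, 5.0, DEVICE_MTU)
```

      With the larger threshold a few queued GSO packets are enough to keep
      the backlog under the limit indefinitely, which matches the zero drop
      count in the first output.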
  15. 09 Dec, 2014 1 commit
  16. 29 Sep, 2014 3 commits
  17. 13 Sep, 2014 2 commits
  18. 23 Aug, 2014 1 commit
  19. 05 Jun, 2014 1 commit
  20. 13 Mar, 2014 1 commit
  21. 14 Jan, 2014 1 commit
  22. 29 Mar, 2013 1 commit
  23. 03 Sep, 2012 1 commit
  24. 16 May, 2012 1 commit
    • fq_codel: should use qdisc backlog as threshold · 865ec552
      Eric Dumazet authored
      The codel_should_drop() logic lets a packet escape dropping if the queue
      size is under the max packet size.
      
      In fq_codel, we have two possible backlogs: the qdisc global one, and
      the flow local one.
      
      The meaningful one for codel_should_drop() should be the global backlog,
      not the per flow one, so that thin flows can have a non zero drop/mark
      probability.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Dave Taht <dave.taht@bufferbloat.net>
      Cc: Kathleen Nichols <nichols@pollere.com>
      Cc: Van Jacobson <van@pollere.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
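      To see why the choice of backlog matters for thin flows, consider the
      sketch below. `eligible_flows` and the flow names are hypothetical, and
      only the backlog comparison is modelled; the sojourn time check that
      accompanies it is left out.

```python
def eligible_flows(flow_backlogs, maxpacket, use_global_backlog):
    """Return the flows whose packets can pass the minimal-backlog test.

    flow_backlogs maps a flow name to its queued bytes.
    """
    total = sum(flow_backlogs.values())
    eligible = []
    for name, backlog in sorted(flow_backlogs.items()):
        compared = total if use_global_backlog else backlog
        if backlog > 0 and compared > maxpacket:
            eligible.append(name)
    return eligible


# One bulk flow and one thin flow holding a single full-size packet.
flows = {"bulk": 187838, "thin": 1514}
per_flow = eligible_flows(flows, maxpacket=1514, use_global_backlog=False)
global_view = eligible_flows(flows, maxpacket=1514, use_global_backlog=True)
```

      Judged by its own backlog, the thin flow can never be dropped or marked;
      judged by the qdisc backlog, it has a non zero probability like any other
      flow.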
  25. 14 May, 2012 1 commit
    • net: codel: fix build errors · 669d67bf
      Sasha Levin authored
      Fix the following build error:
      
      net/sched/sch_fq_codel.c: In function 'fq_codel_dump_stats':
      net/sched/sch_fq_codel.c:464:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:464:3: warning: missing braces around initializer
      net/sched/sch_fq_codel.c:464:3: warning: (near initialization for 'st.<anonymous>')
      net/sched/sch_fq_codel.c:465:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:465:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:465:3: warning: (near initialization for 'st')
      net/sched/sch_fq_codel.c:466:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:466:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:466:3: warning: (near initialization for 'st')
      net/sched/sch_fq_codel.c:467:3: error: unknown field 'qdisc_stats' specified in initializer
      net/sched/sch_fq_codel.c:467:3: warning: excess elements in struct initializer
      net/sched/sch_fq_codel.c:467:3: warning: (near initialization for 'st')
      make[1]: *** [net/sched/sch_fq_codel.o] Error 1
      Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  26. 12 May, 2012 1 commit
    • fq_codel: Fair Queue Codel AQM · 4b549a2e
      Eric Dumazet authored
      Fair Queue Codel packet scheduler
      
      Principles :
      
      - Packets are classified (internal classifier or external) into flows.
      - This is a stochastic model (as we use a hash, several flows might
        be hashed to the same slot)
      - Each flow has a CoDel managed queue.
      - Flows are linked onto two (Round Robin) lists,
        so that new flows have priority on old ones.
      
      - For a given flow, packets are not reordered (CoDel uses a FIFO)
      - head drops only.
      - ECN capability is on by default.
      - Very low memory footprint (64 bytes per flow)
      
      tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                            [ target TIME ] [ interval TIME ] [ noecn ]
                            [ quantum BYTES ]
      
      defaults : 1024 flows, 10240 packets limit, quantum : device MTU
                 target : 5ms (CoDel default)
                 interval : 100ms (CoDel default)
      
      Impressive results under load:
      
      class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
       Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
       rate 201691Kbit 28595pps backlog 0b 312p requeues 0
       lended: 33063109 borrowed: 0 giants: 0
       tokens: -912 ctokens: -912
      
      class fq_codel 10:1735 parent 10:
       (dropped 1292, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4524 parent 10:
       (dropped 1291, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:4e74 parent 10:
       (dropped 1290, overlimits 0 requeues 0)
       backlog 6056b 4p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
      class fq_codel 10:628a parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 7570b 5p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
      class fq_codel 10:a4b3 parent 10:
       (dropped 302, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:c3c2 parent 10:
       (dropped 1284, overlimits 0 requeues 0)
       backlog 13626b 9p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:d331 parent 10:
       (dropped 299, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.0ms
      class fq_codel 10:d526 parent 10:
       (dropped 12160, overlimits 0 requeues 0)
       backlog 35870b 211p requeues 0
        deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
      class fq_codel 10:e2c6 parent 10:
       (dropped 1288, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      class fq_codel 10:eab5 parent 10:
       (dropped 1285, overlimits 0 requeues 0)
       backlog 16654b 11p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 5.9ms
      class fq_codel 10:f220 parent 10:
       (dropped 1289, overlimits 0 requeues 0)
       backlog 15140b 10p requeues 0
        deficit 1514 count 1 lastcount 1 ldelay 7.1ms
      
      qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
       Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
       rate 201697Kbit 28602pps backlog 0b 260p requeues 71
      qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
       Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
       rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
        maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
        new_flows_len 0 old_flows_len 11
      
      PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
      64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
      64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
      64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
      64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
      64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
      64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
      64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
      64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
      64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
      64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms
      
      10 packets transmitted, 10 received, 0% packet loss, time 8999ms
      rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms
      
      Much better than SFQ because of the priority given to new flows, and a fast
      path dirtying fewer cache lines.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
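      The two round robin lists from the principles above can be sketched in
      Python. This is a simplified model under assumptions, not the qdisc
      itself: `FlowQueueSketch` is a hypothetical name, integer flow ids are
      reduced modulo the bucket count in place of a real hash, and the CoDel
      drop logic on each flow is left out.

```python
from collections import deque


class FlowQueueSketch:
    def __init__(self, flows=1024, quantum=1514):
        self.quantum = quantum
        self.queues = [deque() for _ in range(flows)]
        self.deficit = [0] * flows
        self.new_flows = deque()  # buckets that just became active
        self.old_flows = deque()  # buckets that used up a quantum
        self.active = set()       # buckets present on either list

    def enqueue(self, flow_id, length):
        idx = flow_id % len(self.queues)  # stand-in for the flow hash
        self.queues[idx].append((flow_id, length))
        if idx not in self.active:
            self.active.add(idx)
            self.new_flows.append(idx)
            self.deficit[idx] = self.quantum

    def dequeue(self):
        while True:
            # New flows are always served before old ones.
            if self.new_flows:
                head = self.new_flows
            elif self.old_flows:
                head = self.old_flows
            else:
                return None
            idx = head[0]
            if self.deficit[idx] <= 0:
                # Quantum used up: refill and move to the tail of old_flows.
                self.deficit[idx] += self.quantum
                head.popleft()
                self.old_flows.append(idx)
                continue
            if not self.queues[idx]:
                head.popleft()
                if head is self.new_flows and self.old_flows:
                    self.old_flows.append(idx)  # give old flows a turn first
                else:
                    self.active.discard(idx)
                continue
            flow_id, length = self.queues[idx].popleft()
            self.deficit[idx] -= length
            return flow_id, length
```

      If a bulk flow has already sent one full quantum and a new flow then
      enqueues a small packet, the next dequeue returns the new flow's packet
      ahead of the bulk flow's remaining backlog, while packets within each
      flow stay in FIFO order.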