1. 10 Mar, 2016 2 commits
  2. 01 Mar, 2016 1 commit
    • Jamal Hadi Salim's avatar
      introduce IFE action · ef6980b6
      Jamal Hadi Salim authored
      This action allows for a sending side to encapsulate arbitrary metadata
      which is decapsulated by the receiving end.
      The sender runs in encoding mode and the receiver in decode mode.
      Both sender and receiver must specify the same ethertype.
      At some point we hope to have a registered ethertype and we'll
      then provide a default so the user doesnt have to specify it.
      For now we enforce the user specify it.
      
      Lets show example usage where we encode icmp from a sender towards
      a receiver with an skbmark of 17; both sender and receiver use
      ethertype of 0xdead to interop.
      
      YYYY: Lets start with Receiver-side policy config:
      xxx: add an ingress qdisc
      sudo tc qdisc add dev $ETH ingress
      
      xxx: any packets with ethertype 0xdead will be subjected to ife decoding
      xxx: we then restart the classification so we can match on icmp at prio 3
      sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xdead \
      u32 match u32 0 0 flowid 1:1 \
      action ife decode reclassify
      
      xxx: on restarting the classification from above if it was an icmp
      xxx: packet, then match it here and continue to the next rule at prio 4
      xxx: which will match based on skb mark of 17
      sudo tc filter add dev $ETH parent ffff: prio 3 protocol ip \
      u32 match ip protocol 1 0xff flowid 1:1 \
      action continue
      
      xxx: match on skbmark of 0x11 (decimal 17) and accept
      sudo tc filter add dev $ETH parent ffff: prio 4 protocol ip \
      handle 0x11 fw flowid 1:1 \
      action ok
      
      xxx: Lets show the decoding policy
      sudo tc -s filter ls dev $ETH parent ffff: protocol 0xdead
      xxx:
      filter pref 2 u32
      filter pref 2 u32 fh 800: ht divisor 1
      filter pref 2 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1  (rule hit 0 success 0)
        match 00000000/00000000 at 0 (success 0 )
              action order 1: ife decode action reclassify
               index 1 ref 1 bind 1 installed 14 sec used 14 sec
               type: 0x0
               Metadata: allow mark allow hash allow prio allow qmap
              Action statistics:
              Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0
      xxx:
      Observe that above lists all metadatum it can decode. Typically these
      submodules will already be compiled into a monolithic kernel or
      loaded as modules
      
      YYYY: Lets show the sender side now ..
      
      xxx: Add an egress qdisc on the sender netdev
      sudo tc qdisc add dev $ETH root handle 1: prio
      xxx:
      xxx: Match all icmp packets to 192.168.122.237/24, then
      xxx: tag the packet with skb mark of decimal 17, then
      xxx: Encode it with:
      xxx:	ethertype 0xdead
      xxx:	add skb->mark to whitelist of metadatum to send
      xxx:	rewrite target dst MAC address to 02:15:15:15:15:15
      xxx:
      sudo $TC filter add dev $ETH parent 1: protocol ip prio 10  u32 \
      match ip dst 192.168.122.237/24 \
      match ip protocol 1 0xff \
      flowid 1:2 \
      action skbedit mark 17 \
      action ife encode \
      type 0xDEAD \
      allow mark \
      dst 02:15:15:15:15:15
      
      xxx: Lets show the encoding policy
      sudo tc -s filter ls dev $ETH parent 1: protocol ip
      xxx:
      filter pref 10 u32
      filter pref 10 u32 fh 800: ht divisor 1
      filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2  (rule hit 0 success 0)
        match c0a87aed/ffffffff at 16 (success 0 )
        match 00010000/00ff0000 at 8 (success 0 )
      
      	action order 1:  skbedit mark 17
      	 index 6 ref 1 bind 1
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      	action order 2: ife encode action pipe
      	 index 3 ref 1 bind 1
      	 dst MAC: 02:15:15:15:15:15 type: 0xDEAD
       	 Metadata: allow mark
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      xxx:
      
      test by sending ping from sender to destination
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef6980b6
  3. 17 Feb, 2016 1 commit
  4. 18 Sep, 2015 1 commit
  5. 26 Aug, 2015 1 commit
    • Alexei Starovoitov's avatar
      net_sched: act_bpf: remove spinlock in fast path · cff82457
      Alexei Starovoitov authored
      Similar to act_gact/act_mirred, act_bpf can be lockless in packet processing
      with extra care taken to free bpf programs after rcu grace period.
      Replacement of existing act_bpf (very rare) is done with synchronize_rcu()
      and final destruction is done from tc_action_ops->cleanup() callback that is
      called from tcf_exts_destroy()->tcf_action_destroy()->__tcf_hash_release() when
      bind and refcnt reach zero which is only possible when classifier is destroyed.
      Previous two patches fixed the last two classifiers (tcindex and rsvp) to
      call tcf_exts_destroy() from rcu callback.
      
      Similar to gact/mirred there is a race between prog->filter and
      prog->tcf_action. Meaning that the program being replaced may use
      previous default action if it happened to return TC_ACT_UNSPEC.
      act_mirred race betwen tcf_action and tcfm_dev is similar.
      In all cases the race is harmless.
      Long term we may want to improve the situation by replacing the whole
      tc_action->priv as single pointer instead of updating inner fields one by one.
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cff82457
  6. 08 Jul, 2015 2 commits
  7. 20 Mar, 2015 1 commit
    • Daniel Borkmann's avatar
      act_bpf: add initial eBPF support for actions · a8cb5f55
      Daniel Borkmann authored
      This work extends the "classic" BPF programmable tc action by extending
      its scope also to native eBPF code!
      
      Together with commit e2e9b654 ("cls_bpf: add initial eBPF support
      for programmable classifiers") this adds the facility to implement fully
      flexible classifier and actions for tc that can be implemented in a C
      subset in user space, "safely" loaded into the kernel, and being run in
      native speed when JITed.
      
      Also, since eBPF maps can be shared between eBPF programs, it offers the
      possibility that cls_bpf and act_bpf can share data 1) between themselves
      and 2) between user space applications. That means that, f.e. customized
      runtime statistics can be collected in user space, but also more importantly
      classifier and action behaviour could be altered based on map input from
      the user space application.
      
      For the remaining details on the workflow and integration, see the cls_bpf
      commit e2e9b654. Preliminary iproute2 part can be found under [1].
      
        [1] http://git.breakpoint.cc/cgit/dborkman/iproute2.git/log/?h=ebpf-actSigned-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8cb5f55
  8. 19 Jan, 2015 1 commit
    • Felix Fietkau's avatar
      net: sched: Introduce connmark action · 22a5dc0e
      Felix Fietkau authored
      This tc action allows you to retrieve the connection tracking mark
      This action has been used heavily by openwrt for a few years now.
      
      There are known limitations currently:
      
      doesn't work for initial packets, since we only query the ct table.
        Fine given use case is for returning packets
      
      no implicit defrag.
        frags should be rare so fix later..
      
      won't work for more complex tasks, e.g. lookup of other extensions
        since we have no means to store results
      
      we still have a 2nd lookup later on via normal conntrack path.
      This shouldn't break anything though since skb->nfct isn't altered.
      
      V2:
      remove unnecessary braces (Jiri)
      change the action identifier to 14 (Jiri)
      Fix some stylistic issues caught by checkpatch
      V3:
      Move module params to bottom (Cong)
      Get rid of tcf_hashinfo_init and friends and conform to newer API (Cong)
      Acked-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      22a5dc0e
  9. 17 Jan, 2015 1 commit
  10. 24 Nov, 2014 1 commit
  11. 21 Nov, 2014 1 commit
  12. 12 Feb, 2014 1 commit
  13. 06 Dec, 2013 1 commit
  14. 20 Aug, 2010 1 commit
    • Grégoire Baron's avatar
      net/sched: add ACT_CSUM action to update packets checksums · eb4d4065
      Grégoire Baron authored
      net/sched: add ACT_CSUM action to update packets checksums
      
      ACT_CSUM can be called just after ACT_PEDIT in order to re-compute some
      altered checksums in IPv4 and IPv6 packets. The following checksums are
      supported by this patch:
       - IPv4: IPv4 header, ICMP, IGMP, TCP, UDP & UDPLite
       - IPv6: ICMPv6, TCP, UDP & UDPLite
      It's possible to request in the same action to update different kind of
      checksums, if the packets flow mix TCP, UDP and UDPLite, ...
      
      An example of usage is done in the associated iproute2 patch.
      
      Version 3 changes:
       - remove useless goto instructions
       - improve IPv6 hop options decoding
      
      Version 2 changes:
       - coding style correction
       - remove useless arguments of some functions
       - use stack in tcf_csum_dump()
       - add tcf_csum_skb_nextlayer() to factor code
      Signed-off-by: default avatarGregoire Baron <baronchon@n7mm.org>
      Acked-by: default avatarjamal <hadi@cyberus.ca>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb4d4065
  15. 24 Jul, 2010 1 commit
  16. 22 Oct, 2009 1 commit
  17. 12 Sep, 2008 1 commit
  18. 10 Oct, 2007 1 commit
    • Herbert Xu's avatar
      [PKT_SCHED]: Add stateless NAT · b4219952
      Herbert Xu authored
      Stateless NAT is useful in controlled environments where restrictions are
      placed on through traffic such that we don't need connection tracking to
      correctly NAT protocol-specific data.
      
      In particular, this is of interest when the number of flows or the number
      of addresses being NATed is large, or if connection tracking information
      has to be replicated and where it is not practical to do so.
      
      Previously we had stateless NAT functionality which was integrated into
      the IPv4 routing subsystem.  This was a great solution as long as the NAT
      worked on a subnet to subnet basis such that the number of NAT rules was
      relatively small.  The reason is that for SNAT the routing based system
      had to perform a linear scan through the rules.
      
      If the number of rules is large then major renovations would have take
      place in the routing subsystem to make this practical.
      
      For the time being, the least intrusive way of achieving this is to use
      the u32 classifier written by Alexey Kuznetsov along with the actions
      infrastructure implemented by Jamal Hadi Salim.
      
      The following patch is an attempt at this problem by creating a new nat
      action that can be invoked from u32 hash tables which would allow large
      number of stateless NAT rules that can be used/updated in constant time.
      
      The actual NAT code is mostly based on the previous stateless NAT code
      written by Alexey.  In future we might be able to utilise the protocol
      NAT code from netfilter to improve support for other protocols.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b4219952
  19. 22 Sep, 2006 1 commit
    • David S. Miller's avatar
      [PKT_SCHED]: Kill pkt_act.h inlining. · e9ce1cd3
      David S. Miller authored
      This was simply making templates of functions and mostly causing a lot
      of code duplication in the classifier action modules.
      
      We solve this more cleanly by having a common "struct tcf_common" that
      hash worker functions contained once in act_api.c can work with.
      
      Callers work with real action objects that have the common struct
      plus their module specific struct members.  You go from a common
      object to the higher level one using a "to_foo()" macro which makes
      use of container_of() to do the dirty work.
      
      This also kills off act_generic.h which was only used by act_simple.c
      and keeping it around was more work than the it's value.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9ce1cd3
  20. 22 Mar, 2006 1 commit
  21. 24 Apr, 2005 1 commit
  22. 16 Apr, 2005 1 commit
    • Linus Torvalds's avatar
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4