    vxlan: keep original skb ownership · 8f646c92
    Eric Dumazet authored
    
    
    Sathya Perla posted a patch trying to address the following problem:
    
    <quote>
     The vxlan driver sets itself as the socket owner for all the TX flows
     it encapsulates (using vxlan_set_owner()) and assigns its own skb
     destructor. This causes all tunneled traffic to land on only one TXQ,
     as all encapsulated skbs refer to the vxlan socket and not the original
     socket.  Also, the vxlan skb destructor breaks some functionality for
     tunneled traffic, such as wmem accounting, TCP small queues and the
     FQ/pacing packet scheduler.
    </quote>
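
The single-TXQ effect described in the quote can be illustrated with a small userspace model. This is only a sketch: `struct sock` and `struct sk_buff` below are toy stand-ins, and the `hash` field, `pick_txq()`, `encap_take_ownership()` and `distinct_txqs()` are hypothetical names, not kernel APIs. It assumes the TX queue is derived purely from the owning socket, which is a simplification of what the stack really does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct sock {
    uint32_t hash;          /* hypothetical per-socket flow hash */
};

struct sk_buff {
    struct sock *sk;        /* socket that currently owns the packet */
};

/* Hypothetical queue picker: the TX queue depends only on the owner. */
static unsigned int pick_txq(const struct sk_buff *skb, unsigned int nqueues)
{
    assert(nqueues > 0);
    return skb->sk ? skb->sk->hash % nqueues : 0;
}

/* Models the old behaviour: the tunnel socket replaces the original owner. */
static void encap_take_ownership(struct sk_buff *skb, struct sock *tunnel_sk)
{
    skb->sk = tunnel_sk;
}

/* Count how many distinct TX queues one packet from each flow would use. */
static unsigned int distinct_txqs(struct sock *flows, size_t nflows,
                                  struct sock *tunnel_sk, int take_ownership,
                                  unsigned int nqueues)
{
    unsigned char used[64] = {0};
    unsigned int count = 0;

    assert(nqueues <= 64);
    for (size_t i = 0; i < nflows; i++) {
        struct sk_buff skb = { .sk = &flows[i] };

        if (take_ownership)
            encap_take_ownership(&skb, tunnel_sk);

        unsigned int q = pick_txq(&skb, nqueues);
        if (!used[q]) {
            used[q] = 1;
            count++;
        }
    }
    return count;
}
```

In this model, four flows with different hashes spread over four queues when the original owner is kept, and collapse onto one queue once the tunnel socket takes ownership of every skb.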
    
    I reworked Sathya's patch and added some explanations.
    
    vxlan_xmit() can avoid one skb_clone()/dev_kfree_skb() pair
    and gain better drop monitor accuracy, by calling kfree_skb() when
    appropriate.
    
    The UDP socket used by vxlan to perform encapsulation of xmit packets
    does not need to be alive while packets leave vxlan code. It is better
    to keep original socket ownership to get proper feedback from qdisc and
    NIC layers.
    
    We use skb->sk for:

    A) controlling the amount of bytes/packets queued on behalf of a socket,
    but prior vxlan code did the skb->sk transfer without any limit/control
    on vxlan socket sk_sndbuf.

    B) security purposes (such as SELinux) or netfilter uses, and I do not think
    anything is prepared to handle the vxlan stacked case in this area.
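
Point A can be sketched with a toy userspace model of write-memory accounting. Everything here is an assumption made for illustration: the structures are simplified stand-ins, and the `*_model` helpers and `queued_after_encap()` are hypothetical names that only mimic the idea of charging bytes to a socket and releasing them from the skb destructor.

```c
#include <assert.h>
#include <stddef.h>

struct sk_buff;

/* Toy socket: only tracks bytes queued on its behalf. */
struct sock {
    long wmem_alloc;
};

/* Toy skb: owner, accounted size, and a destructor run on release. */
struct sk_buff {
    struct sock *sk;
    unsigned int truesize;
    void (*destructor)(struct sk_buff *skb);
};

/* Give the accounted bytes back to the owning socket. */
static void sock_wfree_model(struct sk_buff *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* Charge the skb to a sending socket. */
static void set_owner_w_model(struct sk_buff *skb, struct sock *sk)
{
    skb->sk = sk;
    skb->destructor = sock_wfree_model;
    sk->wmem_alloc += skb->truesize;
}

/* Drop the current owner, running its destructor first. */
static void orphan_model(struct sk_buff *skb)
{
    if (skb->destructor)
        skb->destructor(skb);
    skb->destructor = NULL;
    skb->sk = NULL;
}

/* Models the old tunnel behaviour: orphan, then re-own with no accounting. */
static void tunnel_set_owner_model(struct sk_buff *skb, struct sock *tunnel_sk)
{
    orphan_model(skb);
    skb->sk = tunnel_sk;    /* nothing is charged to tunnel_sk */
}

/* Bytes the sender still sees as queued right after encapsulation. */
static long queued_after_encap(int transfer_ownership)
{
    struct sock sender = { 0 }, tunnel = { 0 };
    struct sk_buff skb = { NULL, 1500, NULL };

    set_owner_w_model(&skb, &sender);
    if (transfer_ownership)
        tunnel_set_owner_model(&skb, &tunnel);
    return sender.wmem_alloc;
}
```

With the transfer, the sender's counter drops to zero while the packet is still sitting in the qdisc or NIC, so nothing limits how much it queues. Without the transfer, the 1500 bytes stay charged until the skb is actually freed, which is the feedback that mechanisms such as TCP small queues depend on.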
    
    By not changing ownership, vxlan tunnels behave like other tunnels.
    As Stephen mentioned, we might do the same change in L2TP.
    
    Reported-by: Sathya Perla <sathya.perla@emulex.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>