1. 25 Apr, 2007 6 commits
    • [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support · 92f37fd2
      Eric Dumazet authored
      Now that network timestamps use ktime_t infrastructure, we can add a new
      socket option, SO_TIMESTAMPNS.
      This option is similar to SO_TIMESTAMP, but permits transmission of
      a 'struct timespec' instead of a 'struct timeval' control message
      (nanosecond resolution instead of microsecond).
      The control message is labelled SCM_TIMESTAMPNS instead of SCM_TIMESTAMP.
      A socket cannot mix SO_TIMESTAMP and SO_TIMESTAMPNS: the two modes are
      mutually exclusive.
      sock_recv_timestamp() became too big to be fully inlined so I added a
      __sock_recv_timestamp() helper function.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      CC: linux-arch@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Replace CONFIG_NET_DEBUG with sysctl. · a2a316fd
      Stephen Hemminger authored
      Convert network warning messages from a compile-time to a runtime choice.
      Removes the kernel config option and replaces it with the new
      /proc/sys/net/core/warnings sysctl.
      Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution · ae40eb1e
      Eric Dumazet authored
      Now that network timestamps use ktime_t infrastructure, we can add a new
      ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
      User programs can thus access nanosecond resolution.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      CC: Stephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
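A minimal userspace sketch of calling the ioctl described above might look like the following. The wrapper name `last_packet_time_ns` is invented for illustration, and the code assumes a Linux build where the userspace `struct timespec` has the layout the kernel fills in for SIOCGSTAMPNS (typically true on 64-bit systems).

```c
#include <linux/sockios.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical wrapper: fetch the timestamp of the last packet delivered to
 * the user on socket fd, with nanosecond resolution.
 * Returns 0 on success, or -1 with errno set (for example when no packet
 * has been received on the socket yet). */
int last_packet_time_ns(int fd, struct timespec *ts)
{
    return ioctl(fd, SIOCGSTAMPNS, ts);
}
```

Unlike the control-message approach, this only reports the most recent packet, so it suits simple request/response tools better than high-rate receivers.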
    • [TCP]: Abstract out all write queue operations. · fe067e8a
      David S. Miller authored
      This allows the write queue implementation to be changed,
      for example, to one which allows fast interval searching.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [NET]: convert network timestamps to ktime_t · b7aa0bf7
      Eric Dumazet authored
      We currently use a special structure (struct skb_timeval) and plain
      'struct timeval' to store packet timestamps in sk_buffs and struct
      sock.
      This has some drawbacks:
      - Fixed resolution of one microsecond.
      - Waste of space on 64bit platforms where sizeof(struct timeval)=16
      I suggest using ktime_t, which is a nice abstraction of high resolution
      time services, currently capable of nanosecond resolution.
      As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
      an 8-byte shrink of this structure on 64bit architectures. Some other
      structures also benefit from this size reduction (struct ipq in
      ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)
      Once this ktime infrastructure is adopted, we can more easily provide
      nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
      SO_TIMESTAMPNS/SCM_TIMESTAMPNS)
      Note: this patch includes a bug correction in
      compat_sock_get_timestamp() where a "err = 0;" was missing (so this
      syscall returned -ENOENT instead of 0)
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      CC: Stephen Hemminger <shemminger@linux-foundation.org>
      CC: John find <linux.kernel@free.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
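To make the size and resolution argument above concrete, here is a small userspace sketch that stores a timestamp as a signed 64-bit nanosecond count and converts it to `struct timeval` or `struct timespec` on demand. The type `stamp_ns_t` and the three functions are hypothetical stand-ins, not the kernel's ktime_t API, and the conversions assume non-negative timestamps.

```c
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_USEC 1000LL

/* Hypothetical stand-in for ktime_t: 8 bytes holding nanoseconds. */
typedef int64_t stamp_ns_t;

/* Pack seconds + nanoseconds into a single 64-bit value. */
stamp_ns_t stamp_from_timespec(struct timespec ts)
{
    return (stamp_ns_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Full-resolution view, as a nanosecond-capable interface would return. */
struct timespec stamp_to_timespec(stamp_ns_t t)
{
    struct timespec ts;
    ts.tv_sec = t / NSEC_PER_SEC;
    ts.tv_nsec = t % NSEC_PER_SEC;
    return ts;
}

/* Microsecond view for legacy callers; sub-microsecond digits are dropped. */
struct timeval stamp_to_timeval(stamp_ns_t t)
{
    struct timeval tv;
    tv.tv_sec = t / NSEC_PER_SEC;
    tv.tv_usec = (t % NSEC_PER_SEC) / NSEC_PER_USEC;
    return tv;
}
```

Keeping one 8-byte value and deriving both views from it is what lets the microsecond and nanosecond interfaces share the same stored timestamp.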
    • [NET]: Keep sk_backlog near sk_lock · fa438ccf
      Eric Dumazet authored
      sk_backlog is a critical field of struct sock. (known famous words)
      It is (ab)used in hot paths, in particular in release_sock(), tcp_recvmsg(),
      tcp_v4_rcv(), sk_receive_skb().
      It really makes sense to place it next to sk_lock, because sk_backlog is only
      used after sk_lock is locked (so its cache line is already in the L1 cache).
      This should reduce cache misses and sk_lock acquisition time.
      (In theory, we could move only the head pointer near sk_lock and leave tail
      far away, because 'tail' is normally not so hot, but keep it simple :) )
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
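The layout idea above can be sketched in userspace with offsetof(). The structures below are toy stand-ins (not the real struct sock), the 64-byte line size is an assumption, and the check presumes the structure itself starts on a cache-line boundary.

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed L1 cache line size in bytes */

/* Toy stand-ins for the lock and the backlog queue. */
struct toy_lock    { int slock; int owned; };
struct toy_backlog { void *head; void *tail; };

struct toy_sock {
    char cold_fields[256];        /* rarely touched members */
    struct toy_lock lock;         /* taken first on the hot path */
    struct toy_backlog backlog;   /* touched right after the lock */
    char more_fields[128];
};

/* Returns 1 when two member offsets fall in the same cache line,
 * assuming the enclosing object is cache-line aligned. */
int same_cache_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE == off_b / CACHE_LINE;
}
```

A build-time check along the lines of `same_cache_line(offsetof(struct toy_sock, lock), offsetof(struct toy_sock, backlog))` is one way to notice when a later field insertion pushes the two members apart.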
  2. 06 Mar, 2007 1 commit
  3. 02 Mar, 2007 1 commit
    • [NET]: Fix bugs in "Whether sock accept queue is full" checking · 8488df89
      Wei Dong authored
      While using Linux TCP sockets, I found a bug in the function
      sk_acceptq_is_full().
      When a new SYN arrives, the TCP module first checks its validity. If it is
      valid, the server sends a SYN,ACK to the client and adds the sock to the
      SYN hash table. When the valid ACK for that SYN,ACK later arrives from the
      client, the server accepts the connection and increments
      sk->sk_ack_backlog, which is done in tcp_check_req(). We check whether
      the accept queue is full in tcp_v4_syn_recv_sock().
      Consider an example:
      After the listen(sockfd, 1) system call, sk->sk_max_ack_backlog is set to
      1. As we know, sk->sk_ack_backlog is initialized to 0. Assume the accept()
      system call has not been invoked yet.
      1. The 1st connection arrives and sk_acceptq_is_full() is invoked with
         sk->sk_ack_backlog=0 and sk->sk_max_ack_backlog=1. The function
         returns 0, so the connection is accepted and sk->sk_ack_backlog is
         incremented.
      2. The 2nd connection arrives and sk_acceptq_is_full() is invoked with
         sk->sk_ack_backlog=1 and sk->sk_max_ack_backlog=1. The function
         returns 0, so the connection is accepted and sk->sk_ack_backlog is
         incremented.
      3. The 3rd connection arrives and sk_acceptq_is_full() is invoked with
         sk->sk_ack_backlog=2 and sk->sk_max_ack_backlog=1. The function
         returns 1, so the connection is refused.
      This looks like a bug: after listen(sockfd, 1), sk->sk_max_ack_backlog=1,
      yet the socket can accept 2 connections.
      Signed-off-by: Wei Dong <weid@np.css.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
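The three-step example above can be replayed with a small simulation. The `>` comparison reproduces the behaviour the message describes (the check returns 0 when both counters equal 1); the `>=` variant is only an assumed stricter alternative for comparison, since the message itself does not show the patched code. All names here are hypothetical.

```c
#include <stdbool.h>

/* Toy listener holding just the two counters from the example. */
struct toy_listener {
    unsigned int ack_backlog;      /* connections waiting for accept() */
    unsigned int max_ack_backlog;  /* value passed to listen() */
};

/* Behaviour described in the message: full only once the count exceeds
 * the limit, so one extra connection slips through. */
static bool acceptq_is_full_gt(const struct toy_listener *l)
{
    return l->ack_backlog > l->max_ack_backlog;
}

/* Assumed stricter variant: full as soon as the limit is reached. */
static bool acceptq_is_full_ge(const struct toy_listener *l)
{
    return l->ack_backlog >= l->max_ack_backlog;
}

/* Feed `attempts` connections to a listener that never calls accept()
 * and count how many are admitted. */
unsigned int count_accepted(unsigned int max_backlog, unsigned int attempts,
                            bool strict)
{
    struct toy_listener l = { 0, max_backlog };
    unsigned int accepted = 0;

    for (unsigned int i = 0; i < attempts; i++) {
        bool full = strict ? acceptq_is_full_ge(&l) : acceptq_is_full_gt(&l);
        if (full)
            continue;          /* connection refused */
        l.ack_backlog++;       /* connection queued */
        accepted++;
    }
    return accepted;
}
```

With a backlog of 1 and three attempts, the non-strict check admits two connections, matching the walkthrough, while the strict one admits a single connection.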
  4. 28 Feb, 2007 1 commit
    • [NET]: Handle disabled preemption in gfp_any() · 4498121c
      Patrick McHardy authored
      ctnetlink uses netlink_unicast from an atomic_notifier_chain
      (which is called within a RCU read side critical section)
      without holding further locks. netlink_unicast calls netlink_trim
      with the result of gfp_any() for the gfp flags, which are passed
      down to pskb_expand_head. gfp_any() only checks for softirq
      context and otherwise returns GFP_KERNEL, resulting in this warning:
      BUG: sleeping function called from invalid context at mm/slab.c:3032
      in_atomic():1, irqs_disabled():0
      no locks held by rmmod/7010.
      Call Trace:
       [<ffffffff8109467f>] debug_show_held_locks+0x9/0xb
       [<ffffffff8100b0b4>] __might_sleep+0xd9/0xdb
       [<ffffffff810b5082>] __kmalloc+0x68/0x110
       [<ffffffff811ba8f2>] pskb_expand_head+0x4d/0x13b
       [<ffffffff81053147>] netlink_broadcast+0xa5/0x2e0
       [<ffffffff881cd1d7>] :nfnetlink:nfnetlink_send+0x83/0x8a
       [<ffffffff8834f6a6>] :nf_conntrack_netlink:ctnetlink_conntrack_event+0x94c/0x96a
       [<ffffffff810624d6>] notifier_call_chain+0x29/0x3e
       [<ffffffff8106251d>] atomic_notifier_call_chain+0x32/0x60
       [<ffffffff881d266d>] :nf_conntrack:destroy_conntrack+0xa5/0x1d3
       [<ffffffff881d194e>] :nf_conntrack:nf_ct_cleanup+0x8c/0x12c
       [<ffffffff881d4614>] :nf_conntrack:kill_l3proto+0x0/0x13
       [<ffffffff881d482a>] :nf_conntrack:nf_conntrack_l3proto_unregister+0x90/0x94
       [<ffffffff883551b3>] :nf_conntrack_ipv4:nf_conntrack_l3proto_ipv4_fini+0x2b/0x5d
       [<ffffffff8109d44f>] sys_delete_module+0x1b5/0x1e6
       [<ffffffff8105f245>] trace_hardirqs_on_thunk+0x35/0x37
       [<ffffffff8105911e>] system_call+0x7e/0x83
      Since netlink_unicast is supposed to be callable from within RCU
      read side critical sections, make gfp_any() check for in_atomic()
      instead of in_softirq().
      Additionally, nfnetlink_send needs to use gfp_any() as well for the
      call to netlink_broadcast.
      Signed-off-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
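The change in decision logic described above can be modelled in userspace as follows. Everything here is a toy: `toy_context`, the `TOY_GFP_*` values and both functions are hypothetical, and "atomic" is simplified to "in softirq or preemption disabled", which is an assumption made for the sketch rather than the kernel's exact definition.

```c
#include <stdbool.h>

/* Toy allocation modes standing in for GFP_KERNEL / GFP_ATOMIC. */
enum toy_gfp { TOY_GFP_KERNEL, TOY_GFP_ATOMIC };

/* Toy execution context. */
struct toy_context {
    bool in_softirq;        /* running in softirq context */
    bool preempt_disabled;  /* e.g. inside a section that must not sleep */
};

/* Simplified model of "atomic": any state in which sleeping is not allowed. */
static bool toy_in_atomic(const struct toy_context *c)
{
    return c->in_softirq || c->preempt_disabled;
}

/* Old behaviour: only softirq context selects the non-sleeping mode. */
enum toy_gfp toy_gfp_any_old(const struct toy_context *c)
{
    return c->in_softirq ? TOY_GFP_ATOMIC : TOY_GFP_KERNEL;
}

/* New behaviour: any atomic context selects the non-sleeping mode. */
enum toy_gfp toy_gfp_any_new(const struct toy_context *c)
{
    return toy_in_atomic(c) ? TOY_GFP_ATOMIC : TOY_GFP_KERNEL;
}
```

The case from the trace corresponds to a context that is atomic but not in softirq: the old check would pick the sleeping mode there, the new one would not.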
  5. 07 Dec, 2006 2 commits
  6. 04 Dec, 2006 1 commit
  7. 02 Dec, 2006 3 commits
  8. 25 Nov, 2006 1 commit
  9. 22 Oct, 2006 1 commit
  10. 01 Oct, 2006 1 commit
  11. 22 Sep, 2006 3 commits
  12. 03 Jul, 2006 3 commits
  13. 30 Jun, 2006 1 commit
  14. 29 Jun, 2006 1 commit
    • [NET]: Add ECN support for TSO · b0da8537
      Michael Chan authored
      In the current TSO implementation, NETIF_F_TSO and ECN cannot be
      turned on together in a TCP connection.  The problem is that most
      hardware that supports TSO does not handle CWR correctly if it is set
      in the TSO packet.  Correct handling requires CWR to be set in the
      first packet only if it is set in the TSO header.
      This patch adds the ability to turn on NETIF_F_TSO and ECN using
      GSO if necessary to handle TSO packets with CWR set.  Hardware
      that handles CWR correctly can turn on NETIF_F_TSO_ECN in the
      dev->features flag.
      All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set.  If
      the output device does not have the NETIF_F_TSO_ECN feature set, GSO
      will split the packet up correctly with CWR only set in the first
      segment.
      With help from Herbert Xu <herbert@gondor.apana.org.au>.
      Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
      flag is completely removed.
      Signed-off-by: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
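The CWR rule described above (set in the first resulting packet only) can be illustrated with a toy splitter. This is not the kernel's GSO code: `toy_segment` and `toy_gso_split` are hypothetical names, and real segmentation also rewrites headers, sequence numbers and checksums, which this sketch ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy description of one output segment. */
struct toy_segment {
    size_t len;   /* payload bytes carried by this segment */
    bool cwr;     /* whether the CWR flag is set on this segment */
};

/* Split payload_len bytes into segments of at most mss bytes each.
 * If the large packet had CWR set, only the first segment keeps it.
 * Returns the number of segments written (never more than max_segs). */
size_t toy_gso_split(size_t payload_len, size_t mss, bool cwr,
                     struct toy_segment *out, size_t max_segs)
{
    size_t n = 0;

    if (mss == 0)
        return 0;

    while (payload_len > 0 && n < max_segs) {
        size_t chunk = payload_len < mss ? payload_len : mss;

        out[n].len = chunk;
        out[n].cwr = cwr && n == 0;   /* CWR survives on segment 0 only */
        payload_len -= chunk;
        n++;
    }
    return n;
}
```

Hardware that cannot follow this rule would replicate CWR onto every segment, which is why the software path is used for such devices.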
  15. 23 Jun, 2006 2 commits
  16. 17 Jun, 2006 3 commits
  17. 29 Apr, 2006 1 commit
  18. 26 Apr, 2006 1 commit
  19. 20 Apr, 2006 1 commit
    • [NET]: Add skb->truesize assertion checking. · dc6de336
      David S. Miller authored
      Add some sanity checking.  truesize should be at least sizeof(struct
      sk_buff) plus the current packet length.  If not, then truesize is
      seriously mangled and deserves a kernel log message.
      Currently we'll do the check for release of stream socket buffers.
      But we can add checks to more spots over time.
      Incorporating ideas from Herbert Xu.
      Signed-off-by: David S. Miller <davem@davemloft.net>
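The invariant stated above (truesize is at least the size of the buffer header plus the current packet length) can be sketched like this. `toy_skb` is a made-up structure, not the real struct sk_buff, and the message text printed on failure is invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy buffer header; only the two fields relevant to the check are named. */
struct toy_skb {
    unsigned int len;        /* current packet length in bytes */
    unsigned int truesize;   /* memory accounted to this buffer */
    char other_fields[200];  /* placeholder for the rest of the header */
};

/* True when truesize covers at least the header plus the packet data. */
bool toy_truesize_ok(const struct toy_skb *skb)
{
    return skb->truesize >= sizeof(struct toy_skb) + skb->len;
}

/* Log a message instead of failing hard when the invariant is broken. */
void toy_truesize_check(const struct toy_skb *skb)
{
    if (!toy_truesize_ok(skb))
        fprintf(stderr, "truesize mangled: %u < %zu + %u\n",
                skb->truesize, sizeof(struct toy_skb), skb->len);
}
```

Logging rather than aborting fits a check that is being rolled out gradually to more call sites.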
  20. 28 Mar, 2006 1 commit
    • [NET]: deinline 200+ byte inlines in sock.h · f0088a50
      Denis Vlasenko authored
      Sizes in bytes (allyesconfig, i386) and files where those inlines
      are used:
      238 sock_queue_rcv_skb 2.6.16/net/x25/x25_in.o
      238 sock_queue_rcv_skb 2.6.16/net/rose/rose_in.o
      238 sock_queue_rcv_skb 2.6.16/net/packet/af_packet.o
      238 sock_queue_rcv_skb 2.6.16/net/netrom/nr_in.o
      238 sock_queue_rcv_skb 2.6.16/net/llc/llc_sap.o
      238 sock_queue_rcv_skb 2.6.16/net/llc/llc_conn.o
      238 sock_queue_rcv_skb 2.6.16/net/irda/af_irda.o
      238 sock_queue_rcv_skb 2.6.16/net/ipx/af_ipx.o
      238 sock_queue_rcv_skb 2.6.16/net/ipv6/udp.o
      238 sock_queue_rcv_skb 2.6.16/net/ipv6/raw.o
      238 sock_queue_rcv_skb 2.6.16/net/ipv4/udp.o
      238 sock_queue_rcv_skb 2.6.16/net/ipv4/raw.o
      238 sock_queue_rcv_skb 2.6.16/net/ipv4/ipmr.o
      238 sock_queue_rcv_skb 2.6.16/net/econet/econet.o
      238 sock_queue_rcv_skb 2.6.16/net/econet/af_econet.o
      238 sock_queue_rcv_skb 2.6.16/net/bluetooth/sco.o
      238 sock_queue_rcv_skb 2.6.16/net/bluetooth/l2cap.o
      238 sock_queue_rcv_skb 2.6.16/net/bluetooth/hci_sock.o
      238 sock_queue_rcv_skb 2.6.16/net/ax25/ax25_in.o
      238 sock_queue_rcv_skb 2.6.16/net/ax25/af_ax25.o
      238 sock_queue_rcv_skb 2.6.16/net/appletalk/ddp.o
      238 sock_queue_rcv_skb 2.6.16/drivers/net/pppoe.o
      276 sk_receive_skb 2.6.16/net/decnet/dn_nsp_in.o
      276 sk_receive_skb 2.6.16/net/dccp/ipv6.o
      276 sk_receive_skb 2.6.16/net/dccp/ipv4.o
      276 sk_receive_skb 2.6.16/net/dccp/dccp_ipv6.o
      276 sk_receive_skb 2.6.16/drivers/net/pppoe.o
      209 sk_dst_check 2.6.16/net/ipv6/ip6_output.o
      209 sk_dst_check 2.6.16/net/ipv4/udp.o
      209 sk_dst_check 2.6.16/net/decnet/dn_nsp_out.o
      Large inlines with multiple callers:
      Size  Uses Wasted Name and definition
      ===== ==== ====== ================================================
        238   21   4360 sock_queue_rcv_skb          include/net/sock.h
        109   10    801 sock_recv_timestamp         include/net/sock.h
        276    4    768 sk_receive_skb              include/net/sock.h
         94    8    518 __sk_dst_check              include/net/sock.h
        209    3    378 sk_dst_check                include/net/sock.h
        131    4    333 sk_setup_caps               include/net/sock.h
        152    2    132 sk_stream_alloc_pskb        include/net/sock.h
        125    2    105 sk_stream_writequeue_purge  include/net/sock.h
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
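The commit message does not say how the "Wasted" column was computed, but every row of the table is consistent with (Size - 20) * (Uses - 1), i.e. each extra inlined copy costs its size minus roughly 20 bytes for the call that would replace it. The sketch below encodes that inferred formula; the 20-byte constant and the function name are assumptions drawn from the numbers, not from the author.

```c
/* Assumed per-call overhead in bytes, inferred from the table rows. */
#define ASSUMED_CALL_COST 20u

/* Estimate bytes wasted by inlining a function of `size` bytes at `uses`
 * call sites instead of keeping one out-of-line copy. */
unsigned int wasted_bytes(unsigned int size, unsigned int uses)
{
    if (uses == 0 || size < ASSUMED_CALL_COST)
        return 0;
    return (size - ASSUMED_CALL_COST) * (uses - 1);
}
```

For example, `wasted_bytes(238, 21)` gives 4360, matching the sock_queue_rcv_skb row.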
  21. 24 Mar, 2006 1 commit
  22. 20 Mar, 2006 1 commit
  23. 17 Mar, 2006 1 commit
  24. 02 Feb, 2006 1 commit
  25. 06 Jan, 2006 1 commit