1. 16 Nov, 2008 10 commits
    • udp: Use hlist_nulls in UDP RCU code · 88ab1932
      Eric Dumazet authored
      This is a straightforward patch, using hlist_nulls infrastructure.
      RCUification already done on UDP two weeks ago.
      Using hlist_nulls permits us to avoid some memory barriers, both
      at lookup time and delete time.
      Patch is large because it adds new macros to include/net/sock.h.
      These macros will be used by TCP & DCCP in next patch.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rcu: Introduce hlist_nulls variant of hlist · bbaffaca
      Eric Dumazet authored
      hlist uses a NULL value to finish a chain.
      The hlist_nulls variant uses the low-order bit set to 1 to signal an end-of-list marker.
      This allows storing many different end markers, so that some RCU lockless
      algos (used in the TCP/UDP stack, for example) can save some memory barriers in
      fast paths.
      Two new files are added :
        - mimics hlist part of include/linux/list.h, derived to hlist_nulls variant
        - mimics hlist part of include/linux/rculist.h, derived to hlist_nulls variant
         Only four helpers are declared for the moment :
           hlist_nulls_del_init_rcu(), hlist_nulls_del_rcu(),
           hlist_nulls_add_head_rcu() and hlist_nulls_for_each_entry_rcu()
      prefetch() calls were removed, since the end of a list is no longer a NULL value.
      Prefetching an end marker could trigger useless (and possibly dangerous) memory transactions.
      Example of use (extracted from __udp4_lib_lookup())
              struct sock *sk, *result;
              struct hlist_nulls_node *node;
              unsigned short hnum = ntohs(dport);
              unsigned int hash = udp_hashfn(net, hnum);
              struct udp_hslot *hslot = &udptable->hash[hash];
              int score, badness;
              rcu_read_lock();
      begin:
              result = NULL;
              badness = -1;
              sk_nulls_for_each_rcu(sk, node, &hslot->head) {
                      score = compute_score(sk, net, saddr, hnum, sport,
                                            daddr, dport, dif);
                      if (score > badness) {
                              result = sk;
                              badness = score;
                      }
              }
              /*
               * if the nulls value we got at the end of this lookup is
               * not the expected one, we must restart lookup.
               * We probably met an item that was moved to another chain.
               */
              if (get_nulls_value(node) != hash)
                      goto begin;
              if (result) {
                      if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt)))
                              result = NULL;
                      else if (unlikely(compute_score(result, net, saddr, hnum, sport,
                                        daddr, dport, dif) < badness)) {
                              /* drop the reference taken above before retrying */
                              sock_put(result);
                              goto begin;
                      }
              }
              rcu_read_unlock();
              return result;
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
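The end-marker encoding described in this commit can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel's implementation: the `toy_*` names are hypothetical, and it assumes node pointers are at least 2-byte aligned so that a set low-order bit can only mean "end marker". It shows how a reader that finishes a walk on an unexpected marker value can tell it wandered onto another chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; 'next' is either a real node or an encoded marker. */
struct toy_node {
    struct toy_node *next;
    int key;
};

/* Encode an end-of-list marker: low-order bit set, value in the other bits. */
static struct toy_node *toy_make_nulls(uintptr_t value)
{
    return (struct toy_node *)((value << 1) | 1u);
}

/* A pointer with its low-order bit set is an end marker, not a node. */
static int toy_is_nulls(const struct toy_node *p)
{
    return (int)((uintptr_t)p & 1u);
}

/* Recover the value stored in an end marker. */
static uintptr_t toy_nulls_value(const struct toy_node *p)
{
    return (uintptr_t)p >> 1;
}

/*
 * Walk a chain looking for 'key'. Returns the value of the end marker
 * that terminated the walk, so the caller can compare it with the slot
 * it started from and restart if they differ.
 */
static uintptr_t toy_walk(const struct toy_node *head, int key, int *found)
{
    const struct toy_node *p = head;

    *found = 0;
    while (!toy_is_nulls(p)) {
        if (p->key == key)
            *found = 1;
        p = p->next;
    }
    return toy_nulls_value(p);
}
```

A caller that started in slot 3 would compare the return value of `toy_walk()` with 3 and repeat the walk on a mismatch, which mirrors the `get_nulls_value(node) != hash` check in the example above.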
    • TPROXY: implemented IP_RECVORIGDSTADDR socket option · e8b2dfe9
      Balazs Scheidler authored
      In case UDP traffic is redirected to a local UDP socket,
      the originally addressed destination address/port
      cannot be recovered with the in-kernel tproxy.
      This patch adds an IP_RECVORIGDSTADDR sockopt that enables
      an IP_ORIGDSTADDR ancillary message in recvmsg(). This
      ancillary message contains the original destination address/port
      of the packet being received.
      Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
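From user space, the option described above might be used roughly as follows. This is a sketch under assumptions: the helper names `enable_orig_dst()` and `recv_with_orig_dst()` are made up for illustration, the fallback `#define`s only apply if the libc headers lack the constants, and error handling is kept minimal. Setting up the TPROXY redirection itself is outside the scope of the sketch.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Fallbacks for older headers; values as in the Linux UAPI headers. */
#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20
#endif
#ifndef IP_RECVORIGDSTADDR
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#endif

/* Ask the kernel to attach the original destination to each datagram. */
static int enable_orig_dst(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVORIGDSTADDR, &on, sizeof(on));
}

/*
 * Receive one datagram into buf. If an IP_ORIGDSTADDR ancillary message
 * is present, copy it into *orig and set *have_orig to 1.
 * Returns the payload length, or -1 on error.
 */
static ssize_t recv_with_orig_dst(int fd, void *buf, size_t len,
                                  struct sockaddr_in *orig, int *have_orig)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct sockaddr_in))];
        struct cmsghdr align;   /* forces correct alignment of buf */
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);
    *have_orig = 0;

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    /* Scan the control messages for the original destination address. */
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_ORIGDSTADDR) {
            memcpy(orig, CMSG_DATA(c), sizeof(*orig));
            *have_orig = 1;
        }
    }
    return n;
}
```

A proxy would call `enable_orig_dst()` once after creating the socket, then use the address and port returned in `orig` to decide where each redirected datagram was originally headed.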
    • ipv4: Fix ARP behavior with many mac-vlans · 8164f1b7
      Ben Greear authored
      Ben Greear wrote:
      > I have 500 mac-vlans on a system talking to 500 other
      > mac-vlans.  My problem is that the arp-table gets extremely
      > huge because every time an arp-request comes in on all mac-vlans,
      > a stale arp entry is added for each mac-vlan.  I have filtering
      > turned on, but that doesn't help because the neigh_event_ns call
      > below will cause a stale neighbor entry to be created regardless
      > of whether a reply will be sent or not.
      > Maybe the neigh_event code should be below the checks for dont_send,
      > and only call neigh_event_ns if we are !dont_send?
      The attached patch makes it work much better for me.  The patch
      will cause the code to NOT create a stale neighbor entry if we
      are not going to respond to the ARP request.  The old code
      *would* create a stale entry even if we are not going to respond.
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
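The reordering described in this commit can be modelled with a toy example. This is not the kernel's ARP code: every `toy_*` name is hypothetical, and the neighbour table is reduced to a counter, purely to show why checking `dont_send` first keeps filtered requests from adding entries.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the neighbour table: it only counts created entries. */
struct toy_neigh_table {
    size_t entries;
};

/* Toy stand-in for neigh_event_ns(): creates a (stale) entry. */
static void toy_neigh_event_ns(struct toy_neigh_table *t)
{
    t->entries++;
}

/* Old ordering: the entry is created before the filter decision is used. */
static bool toy_arp_request_old(struct toy_neigh_table *t, bool dont_send)
{
    toy_neigh_event_ns(t);
    return !dont_send;          /* true means a reply is sent */
}

/* New ordering: no entry is created when no reply will be sent. */
static bool toy_arp_request_new(struct toy_neigh_table *t, bool dont_send)
{
    if (dont_send)
        return false;
    toy_neigh_event_ns(t);
    return true;
}
```

With 500 filtered requests, the old ordering leaves 500 entries in the toy table while the new ordering leaves none, which is the growth pattern the report complains about.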
    • e1000e: enable ECC correction on 82571 silicon · 6ea7ae1d
      Alexander Duyck authored
      This change enables ECC correction for the packet buffer on all 82571 silicon.
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • phylib: make mdio-gpio work without OF (v4) · f004f3ea
      Paulius Zaleckas authored
      Make mdio-gpio work with non-OpenFirmware GPIO implementations.
      Additional changes to mdio-gpio:
      - use gpio_request() and gpio_free()
      - place irq[] array in struct mdio_gpio_info
      - add module description, author and license
      - add note about compiling this driver as module
      - rename mdc and mdio function (were ugly names)
      - change MII to MDIO in bus name
      - add __init __exit to module (un)loading functions
      - probe fails if no phys added to the bus
      - kzalloc bitbang with sizeof(*bitbang)
      Changes since v3:
      - keep bus naming "%x" to be compatible with existing drivers.
      Changes since v2:
      - more #ifdefs reduction
      - platform driver will be registered on OF platforms also
      - unified platform and OF bus_id to phy%i
      Changes since v1:
      - removed NO_IRQ
      - reduced #ifdefs
      Laurent, please test this driver under OF.
      Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Paulius Zaleckas's avatar
    • dm9000: Fix build error. · 6817ba2c
      David S. Miller authored
      Reported by Stephen Rothwell:
      drivers/net/dm9000.c:1450: error: expected ')' before ';' token
      drivers/net/dm9000.c:1455: error: expected ';' before '}' token
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pegasus: minor resource shrinkage · cda2836d
      David Brownell authored
      Make pegasus driver not allocate a workqueue until the driver
      is bound to some device, which will need that workqueue if
      the device is brought up.  This conserves resources when the
      driver is linked but there's no pegasus device connected.
      Also shrink the runtime footprint a smidgeon by moving some
      init-only code into its proper section, and move an obnoxious
      (frequent and meaningless) message to be debug-only.
      Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: Fix usage of netif_*_all_queues() with netif_carrier_{off|on}() · 74ad0a54
      PJ Waskiewicz authored
      netif_carrier_off() is sufficient to stop Tx into the driver.  Stopping the Tx
      queues is redundant and unnecessary.  By the same token, netif_carrier_on()
      will be sufficient to re-enable Tx, so waking the queues is unnecessary.
      Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Nov, 2008 1 commit
    • net: speedup dst_release() · ef711cf1
      Eric Dumazet authored
      During tbench/oprofile sessions, I found that dst_release() was in third position.
      CPU: Core 2, speed 2999.68 MHz (estimated)
      Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
      samples  %        symbol name
      483726    9.0185  __copy_user_zeroing_intel
      191466    3.5697  __copy_user_intel
      185475    3.4580  dst_release
      175114    3.2648  ip_queue_xmit
      153447    2.8608  tcp_sendmsg
      108775    2.0280  tcp_recvmsg
      102659    1.9140  sysenter_past_esp
      101450    1.8914  tcp_current_mss
      95067     1.7724  __copy_from_user_ll
      86531     1.6133  tcp_transmit_skb
      Of course, all CPUS fight on the dst_entry associated with 
      Instead of first checking the refcount value and then decrementing it,
      we use atomic_dec_return() to help the CPU make the right memory transaction
      (i.e. getting the cache line in exclusive mode).
      dst_release() is now in fifth position, and tbench is a little bit faster ;)
      CPU: Core 2, speed 3000.1 MHz (estimated)
      Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
      samples  %        symbol name
      647107    8.8072  __copy_user_zeroing_intel
      258840    3.5229  ip_queue_xmit
      258302    3.5155  __copy_user_intel
      209629    2.8531  tcp_sendmsg
      165632    2.2543  dst_release
      149232    2.0311  tcp_current_mss
      147821    2.0119  tcp_recvmsg
      137893    1.8767  sysenter_past_esp
      127473    1.7349  __copy_from_user_ll
      121308    1.6510  ip_finish_output
      118510    1.6129  tcp_transmit_skb
      109295    1.4875  tcp_v4_rcv
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
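The difference between the two access patterns can be sketched in portable C11. This is an illustration only, not the kernel's dst_release(): the `toy_*` names are hypothetical, `atomic_fetch_sub()` stands in for the kernel's atomic_dec_return(), and the boolean return value replaces the kernel's warning on an underflowed counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a reference counter. */
struct toy_dst {
    atomic_int refcnt;
};

/*
 * Check-then-decrement: the plain read may first pull the cache line in
 * shared state, and the decrement then has to upgrade it to exclusive.
 * Returns false if the counter was already below 1 before the release.
 */
static bool toy_release_check_then_dec(struct toy_dst *d)
{
    bool ok = atomic_load(&d->refcnt) >= 1;

    atomic_fetch_sub(&d->refcnt, 1);
    return ok;
}

/*
 * Decrement-and-return: a single read-modify-write requests the line in
 * exclusive state once, and the sanity check uses the value it returns.
 * Returns false if the counter went negative.
 */
static bool toy_release_dec_return(struct toy_dst *d)
{
    int newcnt = atomic_fetch_sub(&d->refcnt, 1) - 1;

    return newcnt >= 0;
}
```

Both variants leave the counter in the same state; the second simply touches the shared cache line once per release instead of twice, which is the effect the profile above attributes to the change.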
  3. 13 Nov, 2008 9 commits
  4. 12 Nov, 2008 12 commits
  5. 11 Nov, 2008 8 commits