1. 10 May, 2007 3 commits
    • add upper-32-bits macro · 218e180e
      Andrew Morton authored

      We keep on getting "right shift count >= width of type" warnings when doing
      things like
      
      	sector_t s;
      
      	x = s >> 56;
      
      because with CONFIG_LBD=n, s is only 32-bit.  Similar problems can occur with
      dma_addr_t's.
      
      So add a simple wrapper function which code can use to avoid this warning.
      The above example would become
      
      	x = upper_32_bits(s) >> 24;
      
      The first user is in fact AFS.
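
A minimal sketch of how such a wrapper can avoid the warning. It assumes the macro shifts twice by 16 bits, so that no single shift count reaches the width of a 32-bit type; the patch body is not shown on this page, and `uint32_t` stands in for the kernel's `u32`. The helpers `top_byte` and `top_byte_narrow` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed definition: two 16-bit shifts instead of one 32-bit shift.
 * Each shift count is smaller than the width of a 32-bit operand, so the
 * expression is well defined (and warning-free) for 32-bit and 64-bit n.
 */
#define upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))

/* Hypothetical helper: top byte of a 64-bit value (the CONFIG_LBD=y case). */
static uint32_t top_byte(uint64_t s)
{
	return upper_32_bits(s) >> 24;	/* same bits as s >> 56 */
}

/* Hypothetical helper: the same expression when the type is only 32-bit. */
static uint32_t top_byte_narrow(uint32_t s)
{
	return upper_32_bits(s) >> 24;	/* upper half is empty, so this is 0 */
}
```

With a 32-bit argument the upper half simply evaluates to zero, so the calling code can stay the same whichever width the type has.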
      
      Cc: James Bottomley <James.Bottomley@SteelEye.com>
      Cc: "Cameron, Steve" <Steve.Cameron@hp.com>
      Cc: "Miller, Mike (OS Dev)" <Mike.Miller@hp.com>
      Cc: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: support concurrent local and remote frees and allocs on a slab · 894b8788
      Christoph Lameter authored

      Avoid atomic overhead in slab_alloc and slab_free
      
      SLUB needs to use the slab_lock for the per cpu slabs to synchronize with
      potential kfree operations.  This patch avoids that need by moving all free
      objects onto a lockless_freelist.  The regular freelist continues to exist
      and will be used to free objects.  So while we consume the
      lockless_freelist the regular freelist may build up objects.
      
      If we are out of objects on the lockless_freelist then we may check the
      regular freelist.  If it has objects then we move those over to the
      lockless_freelist and do this again.  This significantly reduces the
      number of atomic operations that have to be performed.
      
      We can even free directly to the lockless_freelist if we know that we are
      running on the same processor.  So this speeds up short-lived objects.
      They may be allocated and freed without taking the slab_lock.  This is
      particularly good for netperf.
      
      In order to maximize the effect of the new faster hotpath we extract the
      hottest performance pieces into inlined functions.  These are then inlined
      into kmem_cache_alloc and kmem_cache_free.  So hotpath allocation and
      freeing no longer require a subroutine call within SLUB.
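
The two-freelist scheme described above can be modelled roughly as follows. This is a single-threaded toy, not the SLUB code: every name (`toy_slab`, `toy_alloc`, `toy_free`, `owner_cpu`, `lock_taken`) is hypothetical, the "lock" is only a counter standing in for slab_lock, and real per-CPU handling, interrupt disabling and refilling from new slabs are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names are illustrative, none are kernel symbols. */
struct object { struct object *next; };

struct toy_slab {
	struct object *lockless_freelist; /* touched only by the owning CPU */
	struct object *freelist;          /* shared list, guarded by the lock */
	int owner_cpu;                    /* CPU this slab currently belongs to */
	unsigned long lock_taken;         /* counts simulated slab_lock uses */
};

static void toy_lock(struct toy_slab *s)   { s->lock_taken++; }
static void toy_unlock(struct toy_slab *s) { (void)s; }

/* Allocate on the owning CPU: the fast path takes no lock at all. */
static void *toy_alloc(struct toy_slab *s)
{
	struct object *obj = s->lockless_freelist;

	if (!obj) {
		/* Slow path: take over whatever remote frees have built up. */
		toy_lock(s);
		s->lockless_freelist = s->freelist;
		s->freelist = NULL;
		toy_unlock(s);

		obj = s->lockless_freelist;
		if (!obj)
			return NULL; /* a real allocator would fetch a new slab */
	}
	s->lockless_freelist = obj->next;
	return obj;
}

/* Free from any CPU: only a free from another CPU needs the lock. */
static void toy_free(struct toy_slab *s, void *p, int cpu)
{
	struct object *obj = p;

	if (cpu == s->owner_cpu) {
		obj->next = s->lockless_freelist;
		s->lockless_freelist = obj;
		return;
	}
	toy_lock(s);
	obj->next = s->freelist;
	s->freelist = obj;
	toy_unlock(s);
}
```

In this model an object that is allocated and freed on the owning CPU never touches the lock counter, which is the short-lived-object case the message describes; only frees from another CPU and refills of an empty lockless list pay for the lock.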
      
      [I am not sure that it is worth doing this because it changes the
      easy-to-read structure of SLUB just to reduce atomic ops.  However, there
      is someone out there with a benchmark on 4-way and 8-way processor systems
      that seems to show a 5% regression vs. SLAB.  The regression seems to be
      due to SLUB's increased use of atomic operations compared with SLAB.  I
      wonder if this is applicable or discernible at all in a real workload?]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Mathieu Desnoyers's avatar
  2. 09 May, 2007 37 commits