• Wolfgang Wander's avatar
    [PATCH] Avoiding mmap fragmentation · 1363c3cd
    Wolfgang Wander authored
    
    
    Ingo recently introduced a great speedup for allocating new mmaps using the
    free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
    causes huge performance increases in thread creation.
    
    The downside of this patch is that it does lead to fragmentation in the
    mmap-ed areas (visible via /proc/self/maps), such that some applications
    that work fine under 2.4 kernels quickly run out of memory on any 2.6
    kernel.
    
    The problem is twofold:
    
      1) the free_area_cache is used to continue a search for memory where
         the last search ended.  Before the change new areas were always
         searched from the base address on.
    
         So now new small areas are cluttering holes of all sizes
         throughout the whole mmap-able region whereas before small holes
         tended to close holes near the base leaving holes far from the base
         large and available for larger requests.
    
      2) the free_area_cache also is set to the location of the last
         munmap-ed area so in scenarios where we allocate e.g.  five regions of
         1K each, then free regions 4 2 3 in this order the next request for 1K
         will be placed in the position of the old region 3, whereas before we
         appended it to the still active region 1, placing it at the location
         of the old region 2.  Before we had 1 free region of 2K, now we only
         get two free regions of 1K -> fragmentation.
    
    The patch addresses thes issues by introducing yet another cache descriptor
    cached_hole_size that contains the largest known hole size below the
    current free_area_cache.  If a new request comes in the size is compared
    against the cached_hole_size and if the request can be filled with a hole
    below free_area_cache the search is started from the base instead.
    
    The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
    (earlier posted) leakme.c test program terminates after 50000+ iterations
    with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
    (as expected) with thread creation, Ingo's test_str02 with 20000 threads
    requires 0.7s system time.
    
    Taking out Ingo's patch (un-patch available per request) by basically
    deleting all mentions of free_area_cache from the kernel and starting the
    search for new memory always at the respective bases we observe: leakme
    terminates successfully with 11 distinctive hardly fragmented areas in
    /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
    time for Ingo's test_str02 with 20000 threads.
    
    Now - drumroll ;-) the appended patch works fine with leakme: it ends with
    only 7 distinct areas in /proc/self/maps and also thread creation seems
    sufficiently fast with 0.71s for 20000 threads.
    Signed-off-by: default avatarWolfgang Wander <wwc@rentec.com>
    Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
    Signed-off-by: default avatarKen Chen <kenneth.w.chen@intel.com>
    Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
    Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    1363c3cd
sched.h 39.5 KB