    mm: set page->pfmemalloc in prep_new_page() · 75379191
    Vlastimil Babka authored
    The possibility of replacing the numerous parameters of alloc_pages*
    functions with a single structure has been discussed when Minchan proposed
    to expand the x86 kernel stack [1].  This series implements the change,
    along with a few more cleanups/micro-optimizations.
    
    The series is based on next-20150108 and I used gcc 4.8.3 20140627 on
    openSUSE 13.2 for compiling.  Config includes NUMA and COMPACTION.
    
    The core change is the introduction of a new struct alloc_context, which looks
    like this:
    
    struct alloc_context {
            struct zonelist *zonelist;
            nodemask_t *nodemask;
            struct zone *preferred_zone;
            int classzone_idx;
            int migratetype;
            enum zone_type high_zoneidx;
    };
    
    The contents are mostly constant, except that __alloc_pages_slowpath()
    changes preferred_zone, classzone_idx and potentially zonelist.  But
    that's not a problem in case control returns to retry_cpuset: in
    __alloc_pages_nodemask(), those will be reset to initial values again
    (although it's a bit subtle).  On the other hand, gfp_flags and alloc_info
    mutate so much that it doesn't make sense to put them into alloc_context.
    Still, the result is one parameter instead of up to 7.  This is all in
    Patch 2.
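
    As a rough illustration of the point above, the following userspace
    sketch models how a single context pointer can stand in for several
    parameters, and how fields changed by a slow path can be reset before a
    retry.  Only struct alloc_context comes from this series; the stub types
    and the functions init_mutable_fields(), slowpath_model() and retry_demo()
    are hypothetical and are not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types, for illustration only. */
struct zone { int idx; };
struct zonelist { struct zone *zones[2]; };
typedef struct { unsigned long bits; } nodemask_t;
enum zone_type { ZONE_DMA, ZONE_NORMAL };

/* Same fields as the struct shown in the commit message. */
struct alloc_context {
        struct zonelist *zonelist;
        nodemask_t *nodemask;
        struct zone *preferred_zone;
        int classzone_idx;
        int migratetype;
        enum zone_type high_zoneidx;
};

/* Hypothetical helper: set the fields that the slow path may later change. */
static void init_mutable_fields(struct alloc_context *ac, struct zonelist *zl)
{
        ac->zonelist = zl;
        ac->preferred_zone = zl->zones[0];
        ac->classzone_idx = ac->preferred_zone->idx;
}

/*
 * Hypothetical slow-path model: one pointer parameter replaces many
 * separate ones, and the callee may retarget the preferred zone.
 */
static void slowpath_model(struct alloc_context *ac)
{
        ac->preferred_zone = ac->zonelist->zones[1];
        ac->classzone_idx = ac->preferred_zone->idx;
}

/* Hypothetical demo: returns 0 if a retry sees the initial values again. */
static int retry_demo(void)
{
        struct zone z0 = { .idx = 0 }, z1 = { .idx = 1 };
        struct zonelist zl = { .zones = { &z0, &z1 } };
        struct alloc_context ac = { .migratetype = 0,
                                    .high_zoneidx = ZONE_NORMAL };

        init_mutable_fields(&ac, &zl);
        slowpath_model(&ac);              /* mutates preferred_zone etc. */
        assert(ac.preferred_zone == &z1);
        init_mutable_fields(&ac, &zl);    /* models the reset on retry */
        return ac.preferred_zone == &z0 && ac.classzone_idx == 0 ? 0 : 1;
}
```

    In this model the constant fields (nodemask, migratetype, high_zoneidx)
    are set once and never touched by the callee, which is what makes passing
    them through a single struct reasonable.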
    
    Patch 3 is a step to expand alloc_context usage out of page_alloc.c
    itself.  The function try_to_compact_pages() can also benefit greatly from
    the parameter reduction, but it means the struct definition has to be
    moved to a shared header.
    
    Patch 1 should IMHO be included even if the rest is deemed not useful
    enough.  It improves maintainability and also has some code/stack
    reduction.  Patch 4 is OTOH a tiny optimization.
    
    Overall bloat-o-meter results:
    
    add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-460 (-460)
    function                                     old     new   delta
    nr_free_zone_pages                           129     115     -14
    __alloc_pages_direct_compact                 329     256     -73
    get_page_from_freelist                      2670    2576     -94
    __alloc_pages_nodemask                      2564    2285    -279
    try_to_compact_pages                         582     579      -3
    
    Overall stack sizes per ./scripts/checkstack.pl:
    
                              old   new delta
    get_page_from_freelist:   184   184     0
    __alloc_pages_nodemask    248   200   -48
    __alloc_pages_direct_c     40     -   -40
    try_to_compact_pages       72    72     0
                                          -88
    
    [1] http://marc.info/?l=linux-mm&m=140142462528257&w=2
    
    This patch (of 4):
    
    prep_new_page() sets almost everything in the struct page of the page
    being allocated, except page->pfmemalloc.  This is not obvious and has at
    least once led to a bug where page->pfmemalloc was not set correctly, see
    commit 8fb74b9f ("mm: compaction: partially revert capture of suitable
    high-order page").
    
    This patch moves the pfmemalloc setting to prep_new_page(), which means it
    needs to gain an alloc_flags parameter.  The call to prep_new_page() is moved
    from buffered_rmqueue() to get_page_from_freelist(), which also leads to
    simpler code.  An obsolete comment for buffered_rmqueue() is replaced.
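
    The idea can be sketched in userspace as follows.  This is a simplified
    model, not the patch itself: the real prep_new_page() takes further
    parameters and does much more work, struct page is reduced to a single
    field, and the sketch assumes that pfmemalloc is derived from an
    ALLOC_NO_WATERMARKS bit in alloc_flags, with a flag value chosen only for
    illustration.  check_prep_new_page() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag value for this sketch; the kernel defines its own ALLOC_* bits. */
#define ALLOC_NO_WATERMARKS 0x04

/* Minimal stand-in for struct page: only the field discussed here. */
struct page {
        bool pfmemalloc;
};

/*
 * Model of prep_new_page() after the change: it receives alloc_flags, so
 * pfmemalloc is set in the same place as the rest of the page state and
 * callers cannot forget it.
 */
static void prep_new_page(struct page *page, int alloc_flags)
{
        /* Assumption: a page allocated while ignoring watermarks is marked. */
        page->pfmemalloc = !!(alloc_flags & ALLOC_NO_WATERMARKS);
}

/* Hypothetical self-check: returns 0 when both cases behave as modelled. */
static int check_prep_new_page(void)
{
        struct page p = { .pfmemalloc = true };   /* stale value on purpose */

        prep_new_page(&p, 0);
        if (p.pfmemalloc)
                return 1;
        prep_new_page(&p, ALLOC_NO_WATERMARKS);
        return p.pfmemalloc ? 0 : 1;
}
```

    Starting from a stale value in the self-check shows why a single
    initialisation point helps: the field is overwritten on every call
    rather than depending on each caller to set it.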
    
    In addition to better maintainability, there is a small reduction of code
    and stack usage for get_page_from_freelist(), which inlines the other
    functions involved.
    
    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-145 (-145)
    function                                     old     new   delta
    get_page_from_freelist                      2670    2525    -145
    
    Stack usage is reduced from 184 to 168 bytes.
    
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
    Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>