    mm: memcontrol: fold mem_cgroup_do_charge() · 6539cc05
    Johannes Weiner authored
    
    
    These patches rework memcg charge lifetime to integrate more naturally
    with the lifetime of user pages.  This drastically simplifies the code
    and reduces charging and uncharging overhead.  The most expensive part
    of charging and uncharging is the page_cgroup bit spinlock, which is
    removed entirely after this series.
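    
    For reference, the lock in question is a bit spinlock embedded in the
    per-page struct page_cgroup.  A condensed sketch of the pre-series
    pattern (simplified from include/linux/page_cgroup.h of this era;
    details elided):
    
        /* Simplified from the pre-series include/linux/page_cgroup.h. */
        enum {
                PCG_LOCK,       /* protects pc->mem_cgroup and the bits below */
                PCG_USED,       /* this page_cgroup is in use */
                PCG_MIGRATION,  /* under page migration */
        };
    
        struct page_cgroup {
                unsigned long flags;
                struct mem_cgroup *mem_cgroup;
        };
    
        static inline void lock_page_cgroup(struct page_cgroup *pc)
        {
                /* taken on every charge and uncharge */
                bit_spin_lock(PCG_LOCK, &pc->flags);
        }
    
        static inline void unlock_page_cgroup(struct page_cgroup *pc)
        {
                bit_spin_unlock(PCG_LOCK, &pc->flags);
        }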
    
    Here are the top-10 profile entries of a stress test that reads a 128G
    sparse file on a freshly booted box, without even a dedicated cgroup
    (i.e. executing in the root memcg).  Before:
    
        15.36%              cat  [kernel.kallsyms]   [k] copy_user_generic_string
        13.31%              cat  [kernel.kallsyms]   [k] memset
        11.48%              cat  [kernel.kallsyms]   [k] do_mpage_readpage
         4.23%              cat  [kernel.kallsyms]   [k] get_page_from_freelist
         2.38%              cat  [kernel.kallsyms]   [k] put_page
         2.32%              cat  [kernel.kallsyms]   [k] __mem_cgroup_commit_charge
         2.18%          kswapd0  [kernel.kallsyms]   [k] __mem_cgroup_uncharge_common
         1.92%          kswapd0  [kernel.kallsyms]   [k] shrink_page_list
         1.86%              cat  [kernel.kallsyms]   [k] __radix_tree_lookup
         1.62%              cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn
    
    After:
    
        15.67%           cat  [kernel.kallsyms]   [k] copy_user_generic_string
        13.48%           cat  [kernel.kallsyms]   [k] memset
        11.42%           cat  [kernel.kallsyms]   [k] do_mpage_readpage
         3.98%           cat  [kernel.kallsyms]   [k] get_page_from_freelist
         2.46%           cat  [kernel.kallsyms]   [k] put_page
         2.13%       kswapd0  [kernel.kallsyms]   [k] shrink_page_list
         1.88%           cat  [kernel.kallsyms]   [k] __radix_tree_lookup
         1.67%           cat  [kernel.kallsyms]   [k] __pagevec_lru_add_fn
         1.39%       kswapd0  [kernel.kallsyms]   [k] free_pcppages_bulk
         1.30%           cat  [kernel.kallsyms]   [k] kfree
    
    As you can see, the memcg footprint has shrunk quite a bit.  The code
    size shrinks accordingly (text/data/bss in bytes, as reported by
    size(1)):
    
       text    data     bss     dec     hex filename
      37970    9892     400   48262    bc86 mm/memcontrol.o.old
      35239    9892     400   45531    b1db mm/memcontrol.o
    
    This patch (of 13):
    
    This function was split out because mem_cgroup_try_charge() got too big.
    But having essentially one sequence of operations arbitrarily split in
    half is not good for reworking the code.  Fold it back in.
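    
    To illustrate the shape of the change, here is a hedged, self-contained
    sketch with hypothetical names (the real functions in mm/memcontrol.c
    also carry the gfp mask and the oom flag).  Before the fold, one
    sequence of operations is split across a helper and its only caller:
    
        #include <errno.h>
        #include <stdbool.h>
    
        struct counter;
        bool counter_try_charge(struct counter *cnt, unsigned int nr_pages);
        bool reclaim_some(struct counter *cnt, unsigned int nr_pages);
    
        /* Second half of the charge sequence, split out on its own. */
        static int do_charge(struct counter *cnt, unsigned int nr_pages)
        {
                if (counter_try_charge(cnt, nr_pages))
                        return 0;
                /* reclaim made progress: worth retrying */
                return reclaim_some(cnt, nr_pages) ? -EAGAIN : -ENOMEM;
        }
    
        /* First half: just the retry loop around the helper. */
        static int try_charge(struct counter *cnt, unsigned int nr_pages)
        {
                int ret;
    
                do {
                        ret = do_charge(cnt, nr_pages);
                } while (ret == -EAGAIN);
                return ret;
        }
    
    After the fold, the whole sequence reads top to bottom in one place:
    
        static int try_charge(struct counter *cnt, unsigned int nr_pages)
        {
                for (;;) {
                        if (counter_try_charge(cnt, nr_pages))
                                return 0;
                        if (!reclaim_some(cnt, nr_pages))
                                return -ENOMEM;
                        /* reclaim made progress: retry the charge */
                }
        }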
    
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vladimir Davydov <vdavydov@parallels.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>