Skip to content
  • Vlastimil Babka's avatar
    mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations · 25160354
    Vlastimil Babka authored
    After the previous patch, we can distinguish costly allocations that
    should be really lightweight, such as THP page faults, with
    __GFP_NORETRY.  This means we don't need to recognize khugepaged
    allocations via PF_KTHREAD anymore.  We can also change THP page faults
    in areas where madvise(MADV_HUGEPAGE) was used to try as hard as
    khugepaged, as the process has indicated that it benefits from THP's and
    is willing to pay some initial latency costs.
    
    We can also make the flags handling less cryptic by distinguishing
    GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
    GFP_TRANSHUGE (only direct reclaim, khugepaged default).  Adding
    __GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed.
    
    The patch effectively changes the current GFP_TRANSHUGE users as
    follows:
    
    * get_huge_zero_page() - the zero page lifetime should be relatively
      long and it's shared by multiple users, so it's worth spending some
      effort on it.  We use GFP_TRANSHUGE, and __GFP_NORETRY is not added.
      This also restores direct reclaim to this allocation, which was
      unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag
      by default to madvise and add a stall-free defrag option")
    
    * alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency
      is not an issue.  So if khugepaged "defrag" is enabled (the default), do
      reclaim via GFP_TRANSHUGE without __GFP_NORETRY.  We can remove the
      PF_KTHREAD check from page alloc.
    
      As a side-effect, khugepaged will now no longer check if the initial
      compaction was deferred or contended.  This is OK, as khugepaged sleep
      times between collapsion attempts are long enough to prevent noticeable
      disruption, so we should allow it to spend some effort.
    
    * migrate_misplaced_transhuge_page() - already was masking out
      __GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is
      equivalent.
    
    * alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise)
      are now allocating without __GFP_NORETRY.  Other vma's keep using
      __GFP_NORETRY if direct reclaim/compaction is at all allowed (by default
      it's allowed only for madvised vma's).  The rest is conversion to
      GFP_TRANSHUGE(_LIGHT).
    
    [mhocko@suse.com: suggested GFP_TRANSHUGE_LIGHT]
    Link: http://lkml.kernel.org/r/20160721073614.24395-7-vbabka@suse.cz
    
    
    Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
    Acked-by: default avatarMichal Hocko <mhocko@suse.com>
    Acked-by: default avatarMel Gorman <mgorman@techsingularity.net>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    25160354