    Revert "mm: remove __GFP_NO_KSWAPD" · 82b212f4
    Mel Gorman authored
    
    
    With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
    based on failures" reverted, Zdenek Kabelac reported the following
    
      Hmm,  so it's just took longer to hit the problem and observe
      kswapd0 spinning on my CPU again - it's not as endless like before -
      but still it easily eats minutes - it helps to turn off Firefox
      or TB  (memory hungry apps) so kswapd0 stops soon - and restart
      those apps again.  (And I still have like >1GB of cached memory)
    
      kswapd0         R  running task        0    30      2 0x00000000
      Call Trace:
        preempt_schedule+0x42/0x60
        _raw_spin_unlock+0x55/0x60
        put_super+0x31/0x40
        drop_super+0x22/0x30
        prune_super+0x149/0x1b0
        shrink_slab+0xba/0x510
    
    The sysrq+m output indicates the system has no swap so it'll never reclaim
    anonymous pages as part of reclaim/compaction.  That is one part of the
    problem but not the root cause as file-backed pages could also be
    reclaimed.
    
    The likely underlying problem is that kswapd is woken up or kept awake
    for each THP allocation request in the page allocator slow path.
    
    If compaction fails for the requesting process then compaction will be
    deferred for a time and direct reclaim is avoided.  However, if there
    is a storm of THP requests that are simply rejected, it will still be
    the case that kswapd is awake for a prolonged period of time as
    pgdat->kswapd_max_order is updated each time.  This is noticed by the
    main kswapd() loop and it will not call kswapd_try_to_sleep().  Instead
    it will loop, shrinking a small number of pages and calling
    shrink_slab() on each iteration.
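
    The effect can be illustrated with a small userspace model.  This is
    only a sketch, not kernel code: struct node_model, model_wakeup() and
    model_kswapd() are hypothetical names, the loop is a heavy
    simplification of kswapd(), and THP_ORDER assumes 2MB huge pages on
    4KB base pages.

    ```c
    #include <assert.h>

    #define THP_ORDER 9	/* assumption: 2MB huge page, 4KB base pages */

    /* Hypothetical stand-in for the part of pgdat that matters here. */
    struct node_model {
    	int kswapd_max_order;	/* highest order requested since last check */
    };

    /* Models a wakeup: record the requested order for kswapd to notice. */
    static void model_wakeup(struct node_model *node, int order)
    {
    	if (order > node->kswapd_max_order)
    		node->kswapd_max_order = order;
    }

    /*
     * Run 'iterations' passes of a simplified kswapd loop.  When thp_storm
     * is non-zero, one rejected THP request arrives before every pass.
     * Returns the number of passes spent reclaiming instead of sleeping.
     */
    static int model_kswapd(int iterations, int thp_storm)
    {
    	struct node_model node = { 0 };
    	int busy = 0;

    	assert(iterations >= 0);

    	for (int i = 0; i < iterations; i++) {
    		if (thp_storm)
    			model_wakeup(&node, THP_ORDER);

    		/* kswapd reads the pending order and resets it. */
    		int new_order = node.kswapd_max_order;
    		node.kswapd_max_order = 0;

    		if (new_order > 0)
    			busy++;	/* stand-in for reclaim + shrink_slab() */
    		/* else: stand-in for kswapd_try_to_sleep() */
    	}
    	return busy;
    }
    ```

    In this model, model_kswapd(1000, 1) reports every pass as busy while
    model_kswapd(1000, 0) reports none, which mirrors the spinning
    described above: each wakeup refreshes the pending order before kswapd
    has a chance to sleep.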
    
    The temptation is to supply a patch that checks if kswapd was woken for
    THP and, if so, ignores pgdat->kswapd_max_order, but it'll be a hack and not
    backed up by proper testing.  As 3.7 is very close to release and this
    is not a bug we should release with, a safer path is to revert "mm:
    remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
    out the balance_pgdat() logic in general.
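
    As a rough illustration of what restoring the flag buys, the check in
    the allocator slow path can be modelled as below.  This is a sketch
    under assumptions: the MODEL_* names and bit values are invented for
    the example and do not match the kernel's definitions, and
    model_slowpath_wakes_kswapd() is a hypothetical helper, not a kernel
    function.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    typedef unsigned int gfp_model_t;

    /* Illustrative bit values only, not the kernel's. */
    #define MODEL_GFP_WAIT		0x1u
    #define MODEL_GFP_NO_KSWAPD	0x2u

    /* An ordinary allocation: kswapd may be woken on the slow path. */
    #define MODEL_GFP_KERNEL	(MODEL_GFP_WAIT)
    /* A THP allocation: carries the flag so kswapd is left alone. */
    #define MODEL_GFP_TRANSHUGE	(MODEL_GFP_WAIT | MODEL_GFP_NO_KSWAPD)

    /* Returns true if the slow path would wake kswapd for this request. */
    static bool model_slowpath_wakes_kswapd(gfp_model_t gfp_mask)
    {
    	return !(gfp_mask & MODEL_GFP_NO_KSWAPD);
    }
    ```

    With the flag back in place, a request using MODEL_GFP_TRANSHUGE never
    reaches the wakeup, so a storm of rejected THP requests no longer
    updates the pending order, while MODEL_GFP_KERNEL requests behave as
    before.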
    
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Cc: Zdenek Kabelac <zkabelac@redhat.com>
    Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
    Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
    Cc: Jiri Slaby <jirislaby@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>