    ksm: make KSM page migration possible · c8d6553b
    Hugh Dickins authored
    
    
    KSM page migration is already supported in the case of memory hotremove,
    which takes the ksm_thread_mutex across all its migrations to keep life
    simple.
    
    But the new KSM NUMA merge_across_nodes knob introduces a problem, when
    it's set to non-default 0: if a KSM page is migrated to a different NUMA
    node, how do we migrate its stable node to the right tree?  And what if
    that collides with an existing stable node?
    
    So far there's no provision for that, and this patch does not attempt to
    deal with it either.  But how will I test a solution, when I don't know
    how to hotremove memory?  The best answer is to enable KSM page migration
    in all cases now, and test more common cases.  With THP and compaction
    added since KSM came in, page migration is now mainstream, and it's a
    shame that a KSM page can frustrate freeing a page block.
    
    Without worrying about merge_across_nodes 0 for now, this patch gets KSM
    page migration working reliably for default merge_across_nodes 1 (but
    leaves the patch enabling it until near the end of the series).
    
    It's much simpler than I'd originally imagined, and does not require an
    additional tier of locking: page migration relies on the page lock, KSM
    page reclaim relies on the page lock, the page lock is enough for KSM page
    migration too.
    
    Almost all the care has to be in get_ksm_page(): that's the function which
    worries about when a stable node is stale and should be freed; now it also
    has to worry about the KSM page being migrated.
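
To make the stale-versus-migrated distinction concrete, here is a minimal userspace sketch. It is a toy model, not the kernel's get_ksm_page(): every toy_* name is hypothetical, and the idea that a lookup samples the node's page frame number first and compares it again afterwards is an assumption made for illustration.

```c
#include <stddef.h>

/* Toy userspace model; NOT kernel code. All toy_* names are hypothetical. */
#define TOY_NR_FRAMES 4

struct toy_stable_node;

struct toy_frame {
    struct toy_stable_node *owner;  /* back-pointer; NULL once freed or moved */
};

struct toy_stable_node {
    size_t kpfn;                    /* index of the frame this node describes */
};

enum toy_lookup { TOY_FOUND, TOY_RETRY, TOY_STALE };

static struct toy_frame toy_frames[TOY_NR_FRAMES];

/* Classify a lookup that began by sampling node->kpfn into kpfn_seen. */
static enum toy_lookup toy_check(const struct toy_stable_node *node,
                                 size_t kpfn_seen)
{
    if (toy_frames[kpfn_seen].owner == node)
        return TOY_FOUND;           /* frame still belongs to this node */
    if (node->kpfn != kpfn_seen)
        return TOY_RETRY;           /* node was repointed: page migrated */
    return TOY_STALE;               /* frame gone and node never repointed */
}

/* Model migration: hand ownership to a new frame and repoint the node. */
static void toy_migrate(struct toy_stable_node *node, size_t new_kpfn)
{
    toy_frames[node->kpfn].owner = NULL;
    toy_frames[new_kpfn].owner = node;
    node->kpfn = new_kpfn;
}
```

In this model a caller that sampled kpfn before toy_migrate() ran gets TOY_RETRY and should sample again, while a node whose frame was simply released gets TOY_STALE and can be freed.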
    
    The only new overhead is an additional put/get/lock/unlock_page when
    stable_tree_search() arrives at a matching node: to make sure migration
    respects the raised page count, and so does not migrate the page while
    we're busy with it here.  That's probably avoidable, either by changing
    internal interfaces from using kpage to stable_node, or by moving the
    ksm_migrate_page() callsite into a page_freeze_refs() section (even if not
    swapcache); but this works well, I've no urge to pull it apart now.
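
The point of taking the extra reference under the page lock can be shown with a small userspace sketch. This is a toy model under stated assumptions, not kernel code: the toy_* names are hypothetical, and migration is reduced to "take the lock, then proceed only if the reference count is exactly what was expected".

```c
#include <stdbool.h>

/* Toy userspace model; NOT kernel code. All toy_* names are hypothetical. */
struct toy_page {
    int refcount;   /* references currently held on the page */
    bool locked;    /* models the page lock */
};

/* Take the page lock if it is free; false means someone else holds it. */
static bool toy_trylock_page(struct toy_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void toy_unlock_page(struct toy_page *p) { p->locked = false; }
static void toy_get_page(struct toy_page *p) { p->refcount++; }
static void toy_put_page(struct toy_page *p) { p->refcount--; }

/* Searcher side: raise the count while holding the lock, then unlock. */
static bool toy_pin_under_lock(struct toy_page *p)
{
    if (!toy_trylock_page(p))
        return false;
    toy_get_page(p);
    toy_unlock_page(p);
    return true;
}

/* Migration side: needs the lock and exactly the expected references. */
static bool toy_try_migrate(struct toy_page *p, int expected_refs)
{
    bool ok;

    if (!toy_trylock_page(p))
        return false;
    ok = (p->refcount == expected_refs);
    toy_unlock_page(p);
    return ok;
}
```

Because both sides serialize on the lock, a migration attempt that starts after toy_pin_under_lock() must observe the raised count and back off until toy_put_page() drops it.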
    
    (Descents of the stable tree may pass through nodes whose KSM pages are
    under migration: being unlocked, the raised page count does not prevent
    that, nor need it: it's safe to memcmp against either old or new page.)
    
    You might worry about mremap, and whether page migration's rmap_walk to
    remove migration entries will find all the KSM locations where it inserted
    earlier: that should already be handled, by the satisfyingly heavy hammer
    of move_vma()'s call to ksm_madvise(,,,MADV_UNMERGEABLE,).
    
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Petr Holasek <pholasek@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Izik Eidus <izik.eidus@ravellosystems.com>
    Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>