    ksm: allow trees per NUMA node · 90bd6fd3
    Petr Holasek authored
    Here's a KSM series, based on mmotm 2013-01-23-17-04: starting with
    Petr's v7 "KSM: numa awareness sysfs knob"; then fixing the two issues
    we had with that, fully enabling KSM page migration on the way.
    
    (A different kind of KSM/NUMA issue which I've certainly not begun to
    address here: when KSM pages are unmerged, there's usually no sense in
    preferring to allocate the new pages local to the caller's node.)
    
    This patch:
    
    Introduces a new sysfs boolean knob, /sys/kernel/mm/ksm/merge_across_nodes,
    which controls merging of pages across different NUMA nodes.  When it is
    set to zero, only pages from the same node are merged; otherwise pages
    from all nodes can be merged together (the default behavior).
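    As a rough userspace illustration of the knob described above, the
    following sketch reads and writes it.  Only the sysfs path comes from
    this patch; the helper names and the ksm_dir parameter are hypothetical,
    and writing to the real file is assumed to require root.

```python
from pathlib import Path

# Directory taken from the commit message; the helpers below are illustrative.
KSM_DIR = Path("/sys/kernel/mm/ksm")


def read_merge_across_nodes(ksm_dir: Path = KSM_DIR) -> bool:
    """Return True if pages may be merged across NUMA nodes (the default)."""
    return (ksm_dir / "merge_across_nodes").read_text().strip() != "0"


def write_merge_across_nodes(enabled: bool, ksm_dir: Path = KSM_DIR) -> None:
    """Write 1 (merge across all nodes) or 0 (merge only within a node)."""
    (ksm_dir / "merge_across_nodes").write_text("1\n" if enabled else "0\n")
```

    The ksm_dir parameter exists only so the helpers can be pointed at a
    scratch directory instead of the live sysfs tree.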
    
    A typical use case is a large number of KVM guests on a NUMA machine,
    where CPUs on more distant nodes would see significantly higher access
    latency to a merged KSM page.  A sysfs knob was chosen for flexibility,
    since some users still prefer a larger amount of saved physical memory
    regardless of access latency.
    
    Every NUMA node has its own stable and unstable trees, which makes
    searching and inserting faster.  The merge_across_nodes value can be
    changed only when there are no KSM shared pages in the system.
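    The two rules in the paragraph above can be pictured with a toy model.
    This is not the code in mm/ksm.c: the class, its method names, the use
    of dicts in place of trees, and the choice of index 0 as the shared
    tree when merging across nodes are all assumptions made for the sketch.

```python
class KsmTrees:
    """Toy model: one stable and one unstable tree per NUMA node."""

    def __init__(self, nr_nodes: int, merge_across_nodes: bool = True):
        self.merge_across_nodes = merge_across_nodes
        # Plain dicts stand in for the kernel's trees, keyed by page content.
        self.stable = [dict() for _ in range(nr_nodes)]
        self.unstable = [dict() for _ in range(nr_nodes)]

    def tree_index(self, page_node: int) -> int:
        # Assumption: when merging across nodes, all pages share tree 0;
        # otherwise a page is only compared with pages from its own node.
        return 0 if self.merge_across_nodes else page_node

    def add_shared(self, content: bytes, page_node: int) -> None:
        # Record a shared (merged) page in the stable tree it belongs to.
        self.stable[self.tree_index(page_node)][content] = page_node

    def pages_shared(self) -> int:
        return sum(len(tree) for tree in self.stable)

    def set_merge_across_nodes(self, value: bool) -> None:
        # The knob may only change while no shared pages exist.
        if value != self.merge_across_nodes and self.pages_shared() > 0:
            raise RuntimeError("KSM shared pages still present")
        self.merge_across_nodes = value
```

    Keeping a separate tree per node means a lookup never has to skip over
    pages from other nodes, which is the speed argument made above.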
    
    I've tested this patch on NUMA machines with 2, 4 and 8 nodes and
    measured the speed of memory access inside KVM guests with memory
    pinned to one of the nodes, using this benchmark:
    
      http://pholasek.fedorapeople.org/alloc_pg.c
    
    Population standard deviations of access times, as a percentage of the
    average, were as follows:
    
    merge_across_nodes=1
    2 nodes  1.4%
    4 nodes  1.6%
    8 nodes  1.7%
    
    merge_across_nodes=0
    2 nodes  1%
    4 nodes  0.32%
    8 nodes  0.018%
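    For reference, the statistic reported above (population standard
    deviation as a percentage of the mean) could be computed as in the
    sketch below.  The function name and the sample values are made up for
    illustration; they are not the measurements behind the table.

```python
import statistics


def pstdev_percent(samples):
    """Population standard deviation expressed as a percentage of the mean."""
    mean = statistics.fmean(samples)
    return 100.0 * statistics.pstdev(samples) / mean


# Hypothetical access times, in arbitrary units.
example_times = [98.0, 100.0, 102.0]
example_spread = pstdev_percent(example_times)
```

    A smaller percentage means access times inside the guests are more
    uniform, which is what the merge_across_nodes=0 rows show.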
    
    RFC: https://lkml.org/lkml/2011/11/30/91
    v1: https://lkml.org/lkml/2012/1/23/46
    v2: https://lkml.org/lkml/2012/6/29/105
    v3: https://lkml.org/lkml/2012/9/14/550
    v4: https://lkml.org/lkml/2012/9/23/137
    v5: https://lkml.org/lkml/2012/12/10/540
    v6: https://lkml.org/lkml/2012/12/23/154
    v7: https://lkml.org/lkml/2012/12/27/225
    
    
    
    Hugh notes that this patch brings two problems, whose solution needs
    further support in mm/ksm.c, which follows in subsequent patches:
    
    1) switching merge_across_nodes after running KSM is liable to oops
       on stale nodes still left over from the previous stable tree;
    
    2) memory hotremove may migrate KSM pages, but there is no provision
       here for !merge_across_nodes to migrate nodes to the proper tree.
    
    Signed-off-by: Petr Holasek <pholasek@redhat.com>
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Acked-by: Rik van Riel <riel@redhat.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Izik Eidus <izik.eidus@ravellosystems.com>
    Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>