    sched/numa: Do not move past the balance point if unbalanced · 095bebf6
    Rik van Riel authored
    There is a subtle interaction between the logic introduced in commit
    e63da036 ("sched/numa: Allow task switch if load imbalance improves"),
    the way the load balancer counts the load on each NUMA node, and the way
    NUMA hinting faults are done.
    
    Specifically, the load balancer only counts currently running tasks
    in the load, while NUMA hinting faults may cause tasks to stop, if
    the page is locked by another task.
    
    This could cause all of the threads of a large single instance workload,
    like SPECjbb2005, to migrate to the same NUMA node. This was possible
    because occasionally they all fault on the same few pages, and only one
    of the threads remains runnable. That thread can move to the process's
    preferred NUMA node without making the imbalance worse, because nothing
    else is running at that time.
    
    The fix is to check the direction of the net movement of load, and to
    refuse a NUMA move if it would cause the system to move past the point
    of balance.  In an unbalanced state, only moves that bring us closer
    to the balance point are allowed.
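
The check described above can be sketched in userspace C roughly as follows. This is an illustration only, not the code from this patch: the function name `numa_move_refused`, its parameter list, and the threshold value of 125 used in the examples are assumptions made for the sketch. Loads are compared after scaling by the other node's capacity, so nodes of different size can be compared without division.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not the kernel's function): decide whether a NUMA
 * task move or swap should be refused.
 *
 *   orig_src_load           load on the source node before the move
 *   src_load, dst_load      loads on both nodes after the proposed move
 *   src_capacity,
 *   dst_capacity            compute capacity of each node
 *   imbalance_pct           allowed imbalance in percent (e.g. 125)
 */
bool numa_move_refused(long orig_src_load, long src_load, long dst_load,
                       long src_capacity, long dst_capacity,
                       long imbalance_pct)
{
    /* Cross-multiply so the two loads are comparable without division. */
    long src_scaled = src_load * dst_capacity;
    long dst_scaled = dst_load * src_capacity;
    long heavier = src_scaled > dst_scaled ? src_scaled : dst_scaled;
    long lighter = src_scaled > dst_scaled ? dst_scaled : src_scaled;

    /* Net load leaving the source node; negative if load flows back. */
    long moved_load = orig_src_load - src_load;

    /* Within the allowed imbalance after the move: always fine. */
    if (heavier * 100 <= lighter * imbalance_pct)
        return false;

    /* No net load moved, so the imbalance itself is unchanged. */
    if (moved_load == 0)
        return false;

    if (moved_load > 0)
        /* Load went src -> dst: refuse if src ended up the lighter node. */
        return src_scaled < dst_scaled;

    /* Load went dst -> src: refuse if dst ended up the lighter node. */
    return dst_scaled < src_scaled;
}
```

With equal capacities and a threshold of 125, the SPECjbb2005 case from this message (one runnable thread with load 100 on the source node, nothing running on the destination) is refused: the move would turn a 100/0 split into a 0/100 split, which is no worse by size but lies on the other side of the balance point. A move that turns 400/100 into 300/200 is still allowed, because it approaches the balance point without crossing it.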
    
    Signed-off-by: Rik van Riel <riel@redhat.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: mgorman@suse.de
    Link: http://lkml.kernel.org/r/20150203165648.0e9ac692@annuminas.surriel.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>