Skip to content
  • Steven Rostedt's avatar
    sched: Use pushable_tasks to determine next highest prio · 5181f4a4
    Steven Rostedt authored
    
    
    Hillf Danton proposed a patch (see link) that cleaned up the
    sched_rt code that calculates the priority of the next highest priority
    task to be used in finding run queues to pull from.
    
    His patch removed the calculating of the next prio to just use the current
    prio when deteriming if we should examine a run queue to pull from. The problem
    with his patch was that it caused more false checks. Because we check a run
    queue for pushable tasks if the current priority of that run queue is higher
    in priority than the task about to run on our run queue. But after grabbing
    the locks and doing the real check, we find that there may not be a task
    that has a higher prio task to pull. Thus the locks were taken with nothing to
    do.
    
    I added some trace_printks() to record when and how many times the run queue
    locks were taken to check for pullable tasks, compared to how many times we
    pulled a task.
    
    With the current method, it was:
    
      3806 locks taken vs 2812 pulled tasks
    
    With Hillf's patch:
    
      6728 locks taken vs 2804 pulled tasks
    
    The number of times locks were taken to pull a task went up almost double with
    no more success rate.
    
    But his patch did get me thinking. When we look at the priority of the highest
    task to consider taking the locks to do a pull, a failure to pull can be one
    of the following: (in order of most likely)
    
     o RT task was pushed off already between the check and taking the lock
     o Waiting RT task can not be migrated
     o RT task's CPU affinity does not include the target run queue's CPU
     o RT task's priority changed between the check and taking the lock
    
    And with Hillf's patch, the thing that caused most of the failures, is
    the RT task to pull was not at the right priority to pull (not greater than
    the current RT task priority on the target run queue).
    
    Most of the above cases we can't help. But the current method does not check
    if the next highest prio RT task can be migrated or not, and if it can not,
    we still grab the locks to do the test (we don't find out about this fact until
    after we have the locks). I thought about this case, and realized that the
    pushable task plist that is maintained only holds RT tasks that can migrate.
    If we move the calculating of the next highest prio task from the inc/dec_rt_task()
    functions into the queuing of the pushable tasks, then we only measure the
    priorities of those tasks that we push, and we get this basically for free.
    
    Not only does this patch make the code a little more efficient, it cleans it
    up and makes it a little simpler.
    
    Thanks to Hillf Danton for inspiring me on this patch.
    
    Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
    Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Hillf Danton <dhillf@gmail.com>
    Cc: Gregory Haskins <ghaskins@novell.com>
    Link: http://lkml.kernel.org/r/BANLkTimQ67180HxCx5vgMqumqw1EkFh3qg@mail.gmail.com
    
    
    Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    5181f4a4