1. 15 Jul, 2016 1 commit
  2. 31 Mar, 2016 1 commit
  3. 07 Dec, 2015 1 commit
    • rcu: Don't redundantly disable irqs in rcu_irq_{enter,exit}() · 7c9906ca
      Paul E. McKenney authored
      This commit replaces a local_irq_save()/local_irq_restore() pair with
      a lockdep assertion that interrupts are already disabled.  This should
      remove the corresponding overhead from the interrupt entry/exit fastpaths.
      
      This change was inspired by the fact that Iftekhar Ahmed's mutation
      testing showed that removing rcu_irq_enter()'s call to local_irq_restore()
      had no effect, which might indicate that interrupts were already disabled
      anyway.
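
      As a rough sketch of the pattern (not the literal diff;
      do_rcu_irq_exit_work() is a hypothetical stand-in for the real work):

      	/* Before: save/restore around work that runs with irqs off anyway. */
      	void rcu_irq_exit_before(void)
      	{
      		unsigned long flags;

      		local_irq_save(flags);		/* irqs are already off here... */
      		do_rcu_irq_exit_work();
      		local_irq_restore(flags);	/* ...so this restores a no-op */
      	}

      	/* After: assert the invariant instead of re-establishing it. */
      	void rcu_irq_exit_after(void)
      	{
      		RCU_LOCKDEP_WARN(!irqs_disabled(),
      				 "rcu_irq_exit() invoked with irqs enabled!!!");
      		do_rcu_irq_exit_work();
      	}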
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  4. 04 Dec, 2015 1 commit
    • rcu: Stop disabling interrupts in scheduler fastpaths · 46a5d164
      Paul E. McKenney authored
      We need the scheduler's fastpaths to be, well, fast, and unnecessarily
      disabling and re-enabling interrupts is not consistent with this goal,
      especially given that there are regions of the scheduler that already
      have interrupts disabled.
      
      This commit therefore moves the call to rcu_note_context_switch()
      to one of the interrupts-disabled regions of the scheduler, and
      removes the now-redundant disabling and re-enabling of interrupts from
      rcu_note_context_switch() and the functions it calls.
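
      A hedged sketch of the restructuring (function bodies simplified,
      scheduler details elided):

      	/* Before: RCU toggled interrupts on every context switch. */
      	void rcu_note_context_switch(void)
      	{
      		unsigned long flags;

      		local_irq_save(flags);
      		rcu_sched_qs();		/* record the quiescent state */
      		local_irq_restore(flags);
      	}

      	/* After: __schedule() invokes RCU from a region that already
      	 * holds the runqueue lock with interrupts disabled, so RCU may
      	 * assume irqs-off and drop the save/restore entirely.
      	 */
      	static void __schedule(void)
      	{
      		/* ... */
      		raw_spin_lock_irq(&rq->lock);	/* irqs off from here on */
      		rcu_note_context_switch();	/* no longer toggles irqs */
      		/* ... */
      	}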
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Shift rcu_note_context_switch() to avoid deadlock, as suggested
        by Peter Zijlstra. ]
  5. 06 Oct, 2015 1 commit
  6. 22 Jul, 2015 1 commit
  7. 27 May, 2015 2 commits
  8. 22 Apr, 2015 1 commit
    • tick: Nohz: Rework next timer evaluation · c1ad348b
      Thomas Gleixner authored
      The evaluation of the next timer in the nohz code is based on jiffies,
      while all the tick internals are nanosecond-based.  We also have to
      convert hrtimer nanoseconds to jiffies in the !highres case.  That's
      just wrong and introduces interesting corner cases.
      
      Turn it around: convert the next timer-wheel expiry and the RCU event
      to clock monotonic, and base all calculations on nanoseconds.  The
      case where no timer is pending is then identified unambiguously by an
      absolute expiry value of KTIME_MAX.
      
      Makes the code more readable and gets rid of the jiffies magic in the
      nohz code.
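
      A sketch of the reworked combination; the helper names are
      illustrative, not the exact kernel functions:

      	static u64 tick_nohz_next_event_ns(void)
      	{
      		u64 next_tmr = next_timer_wheel_expiry_ns();	/* illustrative */
      		u64 next_rcu = next_rcu_event_ns();		/* illustrative */

      		/*
      		 * Everything is clock-monotonic nanoseconds; "no timer
      		 * pending" is the single sentinel KTIME_MAX, with no
      		 * jiffies round-trips and no magic delta values.
      		 */
      		return min(next_tmr, next_rcu);	/* KTIME_MAX if none pending */
      	}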
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 16 Jan, 2015 1 commit
    • rcu: Make cond_resched_rcu_qs() apply to normal RCU flavors · 5cd37193
      Paul E. McKenney authored
      Although cond_resched_rcu_qs() only applies to TASKS_RCU, it is used
      in places where it would be useful for it to apply to the normal RCU
      flavors, rcu_preempt, rcu_sched, and rcu_bh.  This is especially the
      case for workloads that aggressively overload the system, particularly
      those that generate large numbers of RCU updates on systems running
      NO_HZ_FULL CPUs.  This commit therefore communicates quiescent states
      from cond_resched_rcu_qs() to the normal RCU flavors.
      
      Note that it is unfortunately necessary to leave the old ->passed_quiesce
      mechanism in place to allow quiescent states that apply to only one
      flavor to be recorded.  (Yes, we could decrement ->rcu_qs_ctr_snap in
      that case, but that is not so good for debugging of RCU internals.)
      In addition, if one of the RCU flavor's grace period has stalled, this
      will invoke rcu_momentary_dyntick_idle(), resulting in a heavy-weight
      quiescent state visible from other CPUs.
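
      A hedged sketch of the counter handshake (field names follow this
      log, details simplified):

      	DEFINE_PER_CPU(unsigned long, rcu_qs_ctr);

      	/* Update side, from cond_resched_rcu_qs(): announce a quiescent
      	 * state usable by all normal RCU flavors.  this_cpu_inc() is
      	 * safe even from preemptible code (cf. the merge note below).
      	 */
      	static inline void rcu_note_voluntary_qs(void)
      	{
      		this_cpu_inc(rcu_qs_ctr);
      	}

      	/* Grace-period side: a CPU whose counter moved since the last
      	 * snapshot has passed through a quiescent state for every
      	 * normal flavor.
      	 */
      	static bool rcu_qs_since_snap(int cpu, unsigned long snap)
      	{
      		return per_cpu(rcu_qs_ctr, cpu) != snap;
      	}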
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      [ paulmck: Merge commit from Sasha Levin fixing a bug where __this_cpu()
        was used in preemptible code. ]
  10. 10 Jan, 2015 2 commits
    • rcutorture: Check from beginning to end of grace period · 917963d0
      Paul E. McKenney authored
      Currently, rcutorture's Reader Batch checks measure from the end of
      the previous grace period to the end of the current one.  This commit
      tightens up these checks by measuring from the start and end of the same
      grace period.  This involves adding rcu_batches_started() and friends
      corresponding to the existing rcu_batches_completed() and friends.
      
      We leave SRCU alone for the moment, as it does not yet have a way of
      tracking both ends of its grace periods.
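
      A sketch of the tightened check (torture-test plumbing elided;
      synchronize_rcu() stands in for the measured read batch):

      	static void rcutorture_gp_check(void)
      	{
      		unsigned long gp_start, gp_end;

      		gp_start = rcu_batches_started();	/* new start-side API */
      		synchronize_rcu();
      		gp_end = rcu_batches_completed();

      		/* At least one grace period must have both started and
      		 * completed inside the window; the old check compared
      		 * completion counts only.
      		 */
      		WARN_ON(ULONG_CMP_LT(gp_end, gp_start + 1));
      	}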
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Make _batches_completed() functions return unsigned long · 9733e4f0
      Paul E. McKenney authored
      Long ago, the various ->completed fields were of type long, but now are
      unsigned long due to signed-integer-overflow concerns.  However, the
      various _batches_completed() functions remained of type long, even though
      their only purpose in life is to return the corresponding ->completed
      field.  This patch cleans this up by changing these functions' return
      types to unsigned long.
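
      The reasoning in miniature (a self-contained userspace illustration,
      not kernel code):

      	#include <limits.h>
      	#include <stdio.h>

      	int main(void)
      	{
      		unsigned long completed = ULONG_MAX;	/* about to wrap */

      		completed += 2;	/* well defined: wraps around to 1 */
      		printf("%lu\n", completed);

      		/* The same increment on a signed long at LONG_MAX would be
      		 * undefined behavior, which is why ->completed and the
      		 * functions returning it must be unsigned long.
      		 */
      		return 0;
      	}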
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  11. 03 Nov, 2014 2 commits
  12. 14 May, 2014 1 commit
  13. 20 Mar, 2014 1 commit
    • rcu: Provide grace-period piggybacking API · 765a3f4f
      Paul E. McKenney authored
      The following pattern is currently not well supported by RCU:
      
      1.	Make data element inaccessible to RCU readers.
      
      2.	Do work that probably lasts for more than one grace period.
      
      3.	Do something to make sure RCU readers in flight before #1 above
      	have completed.
      
      Here are some things that could currently be done:
      
      a.	Do a synchronize_rcu() unconditionally at either #1 or #3 above.
      	This works, but imposes needless work and latency.
      
      b.	Post an RCU callback at #1 above that does a wakeup, then
      	wait for the wakeup at #3.  This works well, but likely results
      	in an extra unneeded grace period.  Open-coding this is also
      	a bit trickier than would be good.
      
      This commit therefore adds get_state_synchronize_rcu() and
      cond_synchronize_rcu() APIs.  Call get_state_synchronize_rcu() at #1
      above and pass its return value to cond_synchronize_rcu() at #3 above.
      This results in a call to synchronize_rcu() if no grace period has
      elapsed between #1 and #3, but requires only a load, comparison, and
      memory barrier if a full grace period did elapse.
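
      A usage sketch following steps #1-#3 above (gpp and
      do_long_running_work() are illustrative):

      	static void unpublish_and_free(struct foo __rcu **gpp, struct foo *old)
      	{
      		unsigned long gp_state;

      		rcu_assign_pointer(*gpp, NULL);		/* #1: unpublish */
      		gp_state = get_state_synchronize_rcu();	/* snapshot GP state */

      		do_long_running_work();			/* #2: likely spans a GP */

      		cond_synchronize_rcu(gp_state);		/* #3: synchronize_rcu()
      							 * only if no GP elapsed */
      		kfree(old);
      	}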
      Requested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
  14. 17 Feb, 2014 2 commits
  15. 12 Dec, 2013 1 commit
  16. 25 Sep, 2013 2 commits
  17. 10 Jun, 2013 2 commits
  18. 06 Jun, 2012 1 commit
    • rcu: Precompute RCU_FAST_NO_HZ timer offsets · aa9b1630
      Paul E. McKenney authored
      When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick()
      calls rcu_needs_cpu() to see if RCU needs that CPU, and, if not, computes
      the next wakeup time based on the timer wheels.  Only later, when
      actually entering the idle loop, is rcu_prepare_for_idle() invoked.
      In some cases, rcu_prepare_for_idle() will post timers to wake the CPU
      back up.  But all for naught: the next wakeup time for the CPU has
      already been computed, and posting a timer afterwards does not force
      that wakeup time to be recomputed.  This means that
      rcu_prepare_for_idle()'s timers have no effect.
      
      This is not a problem on a busy system because something else will wake
      up the CPU soon enough.  However, on lightly loaded systems, the CPU
      might stay asleep for a considerable length of time.  If that CPU has
      a callback that the rest of the system is waiting on, the system might
      run very slowly or (in theory) even hang.
      
      This commit avoids this problem by having rcu_needs_cpu() give
      tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU
      to wake back up, which tick_nohz_stop_sched_tick() takes into account
      when programming the CPU's wakeup time.  An alternative approach is
      for rcu_prepare_for_idle() to use hrtimers instead of normal timers,
      but timers are much more efficient than are hrtimers for frequently
      and repeatedly posting and cancelling a given timer, which is exactly
      what RCU_FAST_NO_HZ does.
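
      A hedged sketch of the new contract (names illustrative): RCU tells
      the tick code up front how long the CPU may sleep, rather than
      posting a timer after the wakeup time has already been chosen:

      	static u64 nohz_next_wakeup_ns(int cpu, u64 timer_wheel_ns)
      	{
      		unsigned long rcu_delta;	/* jiffies until RCU needs the CPU */

      		if (rcu_needs_cpu(cpu, &rcu_delta))
      			return 0;	/* RCU needs it now: keep the tick */

      		/* Sleep until the earlier of the timer wheel and RCU. */
      		return min(timer_wheel_ns,
      			   (u64)jiffies_to_usecs(rcu_delta) * NSEC_PER_USEC);
      	}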
      Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
  19. 02 May, 2012 1 commit
    • rcu: Make exit_rcu() more precise and consolidate · 9dd8fb16
      Paul E. McKenney authored
      When running preemptible RCU, if a task exits in an RCU read-side
      critical section having blocked within that same RCU read-side critical
      section, the task must be removed from the list of tasks blocking a
      grace period (perhaps the current grace period, perhaps the next grace
      period, depending on timing).  The exit() path invokes exit_rcu() to
      do this cleanup.
      
      However, the current implementation of exit_rcu() needlessly does the
      cleanup even if the task did not block within the current RCU read-side
      critical section, which wastes time and needlessly increases the size
      of the state space.  Fix this by only doing the cleanup if the current
      task is actually on the list of tasks blocking some grace period.
      
      While we are at it, consolidate the two identical exit_rcu() functions
      into a single function.
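
      A hedged sketch of the consolidated exit_rcu() (flag and list names
      as in the preemptible-RCU code of this era):

      	void exit_rcu(void)
      	{
      		struct task_struct *t = current;

      		if (likely(list_empty(&t->rcu_node_entry)))
      			return;	/* never blocked in an RCU read-side
      				 * critical section: nothing to do */
      		/* Pretend to be in a just-ending read-side critical section
      		 * so that __rcu_read_unlock() dequeues the task for us. */
      		t->rcu_read_lock_nesting = 1;
      		barrier();
      		t->rcu_read_unlock_special = RCU_READ_UNLOCK_BLOCKED;
      		__rcu_read_unlock();
      	}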
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      Conflicts:
      
      	kernel/rcupdate.c
  20. 24 Apr, 2012 1 commit
    • rcu: Document why rcu_blocking_is_gp() is safe · 6d813391
      Paul E. McKenney authored
      The rcu_blocking_is_gp() function tests to see if there is only one
      online CPU, and if so, synchronize_sched() and friends become no-ops.
      However, for larger systems, num_online_cpus() scans a large vector,
      and might be preempted while doing so.  While preempted, any number
      of CPUs might come online and go offline, potentially resulting in
      num_online_cpus() returning 1 even though there was never a moment
      when only one CPU was online.  This could result in a too-short RCU
      grace period, which
      could in turn result in total failure, except that the only way that
      the grace period is too short is if there is an RCU read-side critical
      section spanning it.  For RCU-sched and RCU-bh (which are the only
      cases using rcu_blocking_is_gp()), RCU read-side critical sections
      have either preemption or bh disabled, which prevents CPUs from going
      offline.  This in turn prevents actual failures from occurring.
      
      This commit therefore adds a large block comment to rcu_blocking_is_gp()
      documenting why it is safe.  This commit also moves rcu_blocking_is_gp()
      into kernel/rcutree.c, which should help prevent unwary developers from
      mistaking it for a generally useful function.
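
      The function itself is tiny (a sketch, with the commit's new block
      comment condensed into inline comments):

      	static int rcu_blocking_is_gp(void)
      	{
      		might_sleep();	/* Splats inside a read-side critical section. */
      		/*
      		 * num_online_cpus() can race with CPU hotplug, but RCU-sched
      		 * and RCU-bh readers run with preemption or bh disabled,
      		 * which holds off CPU-offline, so a racy 1 is still safe.
      		 */
      		return num_online_cpus() <= 1;
      	}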
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  21. 21 Feb, 2012 3 commits
  22. 28 Sep, 2011 1 commit
  23. 06 May, 2011 2 commits
  24. 29 Nov, 2010 1 commit
  25. 17 Nov, 2010 1 commit
    • rcu: move TINY_RCU from softirq to kthread · b2c0710c
      Paul E. McKenney authored
      If RCU priority boosting is to be meaningful, callback invocation must
      be boosted in addition to preempted RCU readers.  Otherwise, in the
      presence of CPU-bound real-time threads, the grace period ends, but
      the callbacks don't get invoked.  If the callbacks don't get invoked,
      the associated memory doesn't get freed, so the system is still
      subject to OOM.
      
      But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
      moves the callback invocations to a kthread, which can be boosted easily.
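
      A hedged sketch of the move (wait-queue and flag names illustrative):

      	/* A kthread, which can be priority-boosted, now invokes the
      	 * callbacks that the RCU_SOFTIRQ handler used to run.
      	 */
      	static int rcu_kthread(void *arg)
      	{
      		for (;;) {
      			wait_event_interruptible(rcu_kthread_wq,
      						 have_rcu_kthread_work != 0);
      			have_rcu_kthread_work = 0;
      			rcu_process_callbacks();	/* former softirq work */
      		}
      		return 0;	/* not reached */
      	}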
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  26. 20 Aug, 2010 4 commits
    • rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU · 7b0b759b
      Paul E. McKenney authored
      The CONFIG_PREEMPT_RCU kernel configuration parameter was recently
      re-introduced, but as an indication of the type of RCU (preemptible
      vs. non-preemptible) instead of as selecting a given implementation.
      This commit uses CONFIG_PREEMPT_RCU to combine duplicate code
      from include/linux/rcutiny.h and include/linux/rcutree.h into
      include/linux/rcupdate.h.  This commit also combines a few other pieces
      of duplicate code that have accumulated.
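
      The shape of the consolidation, roughly (declarations illustrative):

      	#ifdef CONFIG_PREEMPT_RCU
      	void __rcu_read_lock(void);	/* shared by tree and tiny preempt */
      	void __rcu_read_unlock(void);
      	#else /* #ifdef CONFIG_PREEMPT_RCU */
      	static inline void __rcu_read_lock(void)
      	{
      		preempt_disable();
      	}
      	static inline void __rcu_read_unlock(void)
      	{
      		preempt_enable();
      	}
      	#endif /* #else #ifdef CONFIG_PREEMPT_RCU */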
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: repair code-duplication FIXMEs · a3dc3fb1
      Paul E. McKenney authored
      Combine the duplicate definitions of ULONG_CMP_GE(), ULONG_CMP_LT(),
      and rcu_preempt_depth() into include/linux/rcupdate.h.
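
      For reference, the consolidated wrap-safe comparison macros are
      essentially:

      	/* (a) - (b) wraps modulo 2^BITS_PER_LONG, so the difference is
      	 * "small" when a is at or ahead of b and "huge" when a trails b.
      	 */
      	#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
      	#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))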
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: permit suppressing current grace period's CPU stall warnings · 53d84e00
      Paul E. McKenney authored
      When using a kernel debugger, a long sojourn in the debugger can get
      you lots of RCU CPU stall warnings once you resume.  This might not be
      helpful, especially if you are using the system console.  This patch
      therefore allows RCU CPU stall warnings to be suppressed, but only for
      the duration of the current set of grace periods.
      
      This differs from Jason's original patch in that it adds support for
      tiny RCU and preemptible RCU, and uses a slightly different method for
      suppressing the RCU CPU stall warning messages.
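
      A hedged sketch of the mechanism (the flavor iterator is
      illustrative): push each flavor's stall deadline half the counter
      space into the future, so the warning cannot fire until a later
      grace period re-arms it:

      	void rcu_cpu_stall_reset(void)
      	{
      		struct rcu_state *rsp;

      		for_each_rcu_flavor(rsp)
      			rsp->jiffies_stall = jiffies + ULONG_MAX / 2;
      	}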
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Jason Wessel <jason.wessel@windriver.com>
    • rcu: Add a TINY_PREEMPT_RCU · a57eb940
      Paul E. McKenney authored
      Implement a small-memory-footprint uniprocessor-only implementation of
      preemptible RCU.  This implementation uses but a single blocked-tasks
      list rather than the combinatorial number used per leaf rcu_node by
      TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
      processing.  This version also takes advantage of uniprocessor execution
      to accelerate grace periods in the case where there are no readers.
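
      A hedged sketch of the core data structure (fields illustrative, not
      the exact layout):

      	struct rcu_preempt_ctrlblk {
      		struct list_head blkd_tasks;	/* the single list of tasks
      						 * blocked in RCU read-side
      						 * critical sections */
      		struct list_head *gp_tasks;	/* first task blocking the
      						 * current grace period */
      		u8 gpnum;			/* current grace period */
      		u8 completed;			/* last completed grace period */
      	};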
      
      The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.
      
      This implementation is a step towards having the RCU implementation
      driven off of the SMP and PREEMPT kernel configuration variables, which
      can happen once this implementation has accumulated sufficient
      experience.
      
      Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
      suggested by Steve Rostedt in order to avoid the compiler-reordering
      issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).
      
      As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
      savings compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
      workloads, CONFIG_TINY_RCU is even better.
      
      	CONFIG_TREE_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   6170	    825	     28	   7023	   kernel/rcutree.o
      				   ----
      				   7036    Total
      
      	CONFIG_TINY_PREEMPT_RCU
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	   2081	     81	      8	   2170	   kernel/rcutiny.o
      				   ----
      				   2183    Total
      
      	CONFIG_TINY_RCU (non-preemptible)
      
      	   text	   data	    bss	    dec	   filename
      	     13	      0	      0	     13	   kernel/rcupdate.o
      	    719	     25	      0	    744	   kernel/rcutiny.o
      				    ---
      				    757    Total
      Requested-by: Loïc Minier <loic.minier@canonical.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  27. 10 May, 2010 2 commits
    • rcu: slim down rcutiny by removing rcu_scheduler_active and friends · bbad9379
      Paul E. McKenney authored
      TINY_RCU does not need rcu_scheduler_active unless CONFIG_DEBUG_LOCK_ALLOC.
      So conditionally compile rcu_scheduler_active in order to slim down
      rcutiny a bit more.  Also gets rid of an EXPORT_SYMBOL_GPL, which is
      responsible for most of the slimming.
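
      A sketch of the slimming: only lockdep consumes this flag under
      TINY_RCU, so both the variable and its costly export vanish unless
      CONFIG_DEBUG_LOCK_ALLOC is set:

      	#ifdef CONFIG_DEBUG_LOCK_ALLOC
      	int rcu_scheduler_active __read_mostly;
      	EXPORT_SYMBOL_GPL(rcu_scheduler_active);	/* most of the savings */
      	#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */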
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: refactor RCU's context-switch handling · 25502a6c
      Paul E. McKenney authored
      The addition of preemptible RCU to treercu resulted in a bit of
      confusion and inefficiency surrounding the handling of context switches
      for RCU-sched and for RCU-preempt.  For RCU-sched, a context switch
      is a quiescent state, pure and simple, just like it always has been.
      For RCU-preempt, a context switch is in no way a quiescent state, but
      special handling is required when a task blocks in an RCU read-side
      critical section.
      
      However, the callout from the scheduler and the outer loop in ksoftirqd
      still call something named rcu_sched_qs(), whose name is no longer
      accurate.  Furthermore, when rcu_check_callbacks() notes an RCU-sched
      quiescent state, it ends up unnecessarily (though harmlessly, aside
      from the performance hit) enqueuing the current task if it happens to
      be running in an RCU-preempt read-side critical section.  This not only
      increases the maximum latency of scheduler_tick(), it also needlessly
      increases the overhead of the next outermost rcu_read_unlock() invocation.
      
      This patch addresses this situation by separating the notion of RCU's
      context-switch handling from that of RCU-sched's quiescent states.
      The context-switch handling is covered by rcu_note_context_switch() in
      general and by rcu_preempt_note_context_switch() for preemptible RCU.
      This permits rcu_sched_qs() to handle quiescent states and only quiescent
      states.  It also reduces the maximum latency of scheduler_tick(), though
      probably by much less than a microsecond.  Finally, it means that tasks
      within preemptible-RCU read-side critical sections avoid incurring the
      overhead of queuing unless there really is a context switch.
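
      The resulting split is small, roughly:

      	/* One scheduler-facing entry point: a quiescent state for
      	 * RCU-sched, plus RCU-preempt's (non-quiescent) blocked-task
      	 * bookkeeping.
      	 */
      	void rcu_note_context_switch(int cpu)
      	{
      		rcu_sched_qs(cpu);			/* QS for RCU-sched only */
      		rcu_preempt_note_context_switch(cpu);	/* blocked-task handling */
      	}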
      Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>