  1. May 26, 2011
    • rcu: Decrease memory-barrier usage based on semi-formal proof · 23b5c8fa
      Paul E. McKenney authored
      
      (Note: this was reverted, and is now being re-applied in pieces, with
      this being the fifth and final piece.  See below for the reason that
      it is now felt to be safe to re-apply this.)
      
      Commit d09b62df fixed grace-period synchronization, but left some smp_mb()
      invocations in rcu_process_callbacks() that are no longer needed, but
      sheer paranoia prevented them from being removed.  This commit removes
      them and provides a proof of correctness in their absence.  It also adds
      a memory barrier to rcu_report_qs_rsp() immediately before the update to
      rsp->completed in order to handle the theoretical possibility that the
      compiler or CPU might move massive quantities of code into a lock-based
      critical section.  This also proves that the sheer paranoia was not
      entirely unjustified, at least from a theoretical point of view.
      
      In addition, the old dyntick-idle synchronization depended on the fact
      that grace periods were many milliseconds in duration, so that it could
      be assumed that no dyntick-idle CPU could reorder a memory reference
      across an entire grace period.  Unfortunately for this design, the
      addition of expedited grace periods breaks this assumption, which has
      the unfortunate side-effect of requiring atomic operations in the
      functions that track dyntick-idle state for RCU.  (There is some hope
      that the algorithms used in user-level RCU might be applied here, but
      some work is required to handle the NMIs that user-space applications
      can happily ignore.  For the short term, better safe than sorry.)
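
      As a rough sketch of what such atomic tracking can look like (purely
      illustrative: the rcu_dynticks structure exists in the kernel, but the
      example_* function names and exact barrier placement here are
      assumptions, not this commit's code), the counter is kept even while
      the CPU is in dyntick-idle mode and odd otherwise, so another CPU can
      sample it and detect idle periods even across a very short expedited
      grace period:

      #include <linux/atomic.h>

      struct rcu_dynticks {
      	atomic_t dynticks;	/* Even: in dyntick-idle mode; odd: not. */
      };

      /* Hypothetical sketch of entering dyntick-idle mode. */
      static void example_enter_dyntick_idle(struct rcu_dynticks *rdtp)
      {
      	smp_mb__before_atomic_inc();	/* Prior accesses before idle period. */
      	atomic_inc(&rdtp->dynticks);	/* Odd -> even: CPU now idle. */
      	WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
      }

      /* Hypothetical sketch of leaving dyntick-idle mode. */
      static void example_exit_dyntick_idle(struct rcu_dynticks *rdtp)
      {
      	atomic_inc(&rdtp->dynticks);	/* Even -> odd: CPU now non-idle. */
      	smp_mb__after_atomic_inc();	/* Idle period before later accesses. */
      	WARN_ON_ONCE(!(atomic_read(&rdtp->dynticks) & 0x1));
      }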
      
      This proof assumes that neither compiler nor CPU will allow a lock
      acquisition and release to be reordered, as doing so can result in
      deadlock.  The proof is as follows:
      
      1.	A given CPU declares a quiescent state under the protection of
      	its leaf rcu_node's lock.
      
      2.	If there is more than one level of rcu_node hierarchy, the
      	last CPU to declare a quiescent state will also acquire the
      	->lock of the next rcu_node up in the hierarchy,  but only
      	after releasing the lower level's lock.  The acquisition of this
      	lock clearly cannot occur prior to the acquisition of the leaf
      	node's lock.
      
      3.	Step 2 repeats until we reach the root rcu_node structure.
      	Please note again that only one lock is held at a time through
      	this process.  The acquisition of the root rcu_node's ->lock
      	must occur after the release of that of the leaf rcu_node.
      
      4.	At this point, we set the ->completed field in the rcu_state
      	structure in rcu_report_qs_rsp().  However, if the rcu_node
      	hierarchy contains only one rcu_node, then in theory the code
      	preceding the quiescent state could leak into the critical
      	section.  We therefore precede the update of ->completed with a
      	memory barrier.  All CPUs will therefore agree that any updates
      	preceding any report of a quiescent state will have happened
      	before the update of ->completed.
      
      5.	Regardless of whether a new grace period is needed, rcu_start_gp()
      	will propagate the new value of ->completed to all of the leaf
      	rcu_node structures, under the protection of each rcu_node's ->lock.
      	If a new grace period is needed immediately, this propagation
      	will occur in the same critical section that ->completed was
      	set in, but courtesy of the memory barrier in #4 above, is still
      	seen to follow any pre-quiescent-state activity.
      
      6.	When a given CPU invokes __rcu_process_gp_end(), it becomes
      	aware of the end of the old grace period and therefore makes
      	any RCU callbacks that were waiting on that grace period eligible
      	for invocation.
      
      	If this CPU is the same one that detected the end of the grace
      	period, and if there is but a single rcu_node in the hierarchy,
      	we will still be in the single critical section.  In this case,
      	the memory barrier in step #4 guarantees that all callbacks will
      	be seen to execute after each CPU's quiescent state.
      
      	On the other hand, if this is a different CPU, it will acquire
      	the leaf rcu_node's ->lock, and will again be serialized after
      	each CPU's quiescent state for the old grace period.
      
      On the strength of this proof, this commit therefore removes the memory
      barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
      The effect is to reduce the number of memory barriers by one and to
      reduce the frequency of execution from about once per scheduling tick
      per CPU to once per grace period.
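
      A minimal sketch of that single added barrier (the example_* name is
      hypothetical; ->completed and ->gpnum are the rcu_state fields the
      text refers to, and the caller is assumed to hold the root rcu_node's
      ->lock):

      /* Step #4 of the proof: order everything preceding the reported
       * quiescent states before the update of ->completed. */
      static void example_report_qs_rsp(struct rcu_state *rsp)
      {
      	smp_mb();			/* The one barrier this commit adds. */
      	rsp->completed = rsp->gpnum;	/* Mark the grace period as ended. */
      }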
      
      This commit was reverted due to hangs found during testing by Yinghai Lu and
      Ingo Molnar.  Frederic Weisbecker supplied Yinghai with tracing that
      located the underlying problem, and Frederic also provided the fix.
      
      The underlying problem was that the HARDIRQ_ENTER() macro from
      lib/locking-selftest.c invoked irq_enter(), which in turn invokes
      rcu_irq_enter(), but HARDIRQ_EXIT() invoked __irq_exit(), which
      does not invoke rcu_irq_exit().  This situation resulted in calls
      to rcu_irq_enter() that were not balanced by the required calls to
      rcu_irq_exit().  Therefore, after these locking selftests completed,
      RCU's dyntick-idle nesting count was a large number (for example,
      72), which caused RCU to conclude that the affected CPU was not in
      dyntick-idle mode when in fact it was.
      
      RCU would therefore incorrectly wait for this dyntick-idle CPU, resulting
      in hangs.
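
      A toy model of the imbalance (all names hypothetical): each
      rcu_irq_enter() increments a per-CPU nesting count and each
      rcu_irq_exit() decrements it, and the CPU can be seen as dyntick-idle
      only when the count is zero, so unmatched enters pin the CPU as
      non-idle forever:

      #include <linux/percpu.h>

      static DEFINE_PER_CPU(long, example_dynticks_nesting);

      static void example_rcu_irq_enter(void)
      {
      	__this_cpu_inc(example_dynticks_nesting);  /* CPU marked non-idle. */
      }

      static void example_rcu_irq_exit(void)
      {
      	__this_cpu_dec(example_dynticks_nesting);
      	/* Only matched exits return the count to zero; the selftests'
      	 * 72 unmatched enters left the CPU looking permanently
      	 * non-idle to RCU, hence the hangs. */
      }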
      
      In contrast, with Frederic's patch, which replaces the irq_enter()
      in HARDIRQ_ENTER() with an __irq_enter(), these tests don't ever call
      either rcu_irq_enter() or rcu_irq_exit(), which works because the CPU
      running the test is already marked as not being in dyntick-idle mode.
      Because the rcu_irq_enter() and rcu_irq_exit() calls are omitted
      entirely, RCU has no problem working out which CPUs are in
      dyntick-idle mode and which are not.
      
      The reason that the imbalance was not noticed before the barrier patch
      was applied is that the old implementation of rcu_enter_nohz() ignored
      the nesting depth.  This could still result in delays, but much shorter
      ones.  Whenever there was a delay, RCU would IPI the CPU with the
      unbalanced nesting level, which would eventually result in rcu_enter_nohz()
      being called, which in turn would force RCU to see that the CPU was in
      dyntick-idle mode.
      
      The reason that very few people noticed the problem is that the mismatched
      irq_enter() vs. __irq_exit() occurred only when the kernel was built with
      CONFIG_DEBUG_LOCKING_API_SELFTESTS.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  2. May 06, 2011
    • rcu: Add forward-progress diagnostic for per-CPU kthreads · 5ece5bab
      Paul E. McKenney authored
      
      Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s
      service loop, and add it to the rcudata trace output.
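
      A minimal sketch of the diagnostic (names hypothetical): a healthy
      kthread shows a monotonically advancing per-CPU counter in the trace,
      while a stuck one shows a frozen value:

      static DEFINE_PER_CPU(unsigned long, example_kthread_loops);

      static int example_rcu_cpu_kthread(void *arg)
      {
      	for (;;) {
      		__this_cpu_inc(example_kthread_loops);	/* One tick per pass. */
      		/* ... wait for and process this CPU's RCU work ... */
      	}
      	return 0;
      }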
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: add grace-period age and more kthread state to tracing · 15ba0ba8
      Paul E. McKenney authored
      
      This commit adds the age in jiffies of the current grace period along
      with the duration in jiffies of the longest grace period since boot
      to the rcu/rcugp debugfs file.  It also adds an additional "O" state
      to kthread tracing to differentiate between the kthread waiting due to
      having nothing to do on the one hand and waiting due to being on the
      wrong CPU on the other hand.
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: update tracing documentation for new rcutorture and rcuboost · 90e6ac36
      Paul E. McKenney authored
      
      This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost
      trace files.  The description has been updated as suggested by Josh
      Triplett.
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: add callback-queue information to rcudata output · 0ac3d136
      Paul E. McKenney authored
      
      This commit adds an indication of the state of the callback queue using
      a string of four characters following the "ql=" integer queue length.
      The first character is "N" if there are callbacks that have been
      queued that are not yet ready to be handled by the next grace period, or
      "." otherwise.  The second character is "R" if there are callbacks queued
      that are ready to be handled by the next grace period, or "." otherwise.
      The third character is "W" if there are callbacks waiting for the current
      grace period, or "." otherwise.  Finally, the fourth character is "D"
      if there are callbacks that have been handled by a prior grace period
      and are waiting to be invoked, or ".".
      
      Note that callbacks that are in the process of being invoked are
      not shown.  These callbacks would have been removed from the rcu_data
      structure's list by rcu_do_batch() prior to being executed.  (These
      callbacks are also not reflected in the "ql=" total, FWIW.)
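
      A sketch of how such a state string might be emitted, using the
      kernel's ".X"[cond] indexing idiom (the example_cbs_* predicates are
      hypothetical stand-ins for checks against the rcu_data structure's
      segmented callback list):

      #include <linux/seq_file.h>

      static void example_print_cb_state(struct seq_file *m, struct rcu_data *rdp)
      {
      	seq_printf(m, " ql=%ld %c%c%c%c", rdp->qlen,
      		   ".N"[!!example_cbs_new(rdp)],	/* Not yet ready for a GP. */
      		   ".R"[!!example_cbs_ready(rdp)],	/* Ready for next GP. */
      		   ".W"[!!example_cbs_waiting(rdp)],	/* Waiting on current GP. */
      		   ".D"[!!example_cbs_done(rdp)]);	/* Done; awaiting invocation. */
      }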
      
      Also, document the new callback-queue trace information.
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Update RCU's trace.txt documentation for new format · 2fa218d8
      Paul E. McKenney authored
      
      The trace.txt file had obsolete output for the debugfs rcu/rcudata
      file, so update it.
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: merge TREE_PREEMPT_RCU blocked_tasks[] lists · 12f5f524
      Paul E. McKenney authored
      
      Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the
      rcu_node structure into a single ->blkd_tasks list with ->gp_tasks
      and ->exp_tasks tail pointers.  This is in preparation for RCU priority
      boosting, which will add a third dimension to the combinatorial explosion
      in the ->blocked_tasks[] case, but simply a third pointer in the new
      ->blkd_tasks case.
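
      A simplified sketch of the new layout (the three field names are those
      given in this commit; all other rcu_node fields are omitted):

      struct rcu_node {
      	/* ... other fields omitted ... */
      	struct list_head blkd_tasks;	/* All tasks blocked in RCU
      					 * read-side critical sections
      					 * for this node. */
      	struct list_head *gp_tasks;	/* First task blocking the
      					 * current grace period, or NULL. */
      	struct list_head *exp_tasks;	/* First task blocking the current
      					 * expedited grace period, or NULL. */
      };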
      
      Also update the documentation to reflect the blocked_tasks[] merge.
      
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Decrease memory-barrier usage based on semi-formal proof · e59fb312
      Paul E. McKenney authored
      
      Commit d09b62df fixed grace-period synchronization, but left some smp_mb()
      invocations in rcu_process_callbacks() that are no longer needed, but
      sheer paranoia prevented them from being removed.  This commit removes
      them and provides a proof of correctness in their absence.  It also adds
      a memory barrier to rcu_report_qs_rsp() immediately before the update to
      rsp->completed in order to handle the theoretical possibility that the
      compiler or CPU might move massive quantities of code into a lock-based
      critical section.  This also proves that the sheer paranoia was not
      entirely unjustified, at least from a theoretical point of view.
      
      In addition, the old dyntick-idle synchronization depended on the fact
      that grace periods were many milliseconds in duration, so that it could
      be assumed that no dyntick-idle CPU could reorder a memory reference
      across an entire grace period.  Unfortunately for this design, the
      addition of expedited grace periods breaks this assumption, which has
      the unfortunate side-effect of requiring atomic operations in the
      functions that track dyntick-idle state for RCU.  (There is some hope
      that the algorithms used in user-level RCU might be applied here, but
      some work is required to handle the NMIs that user-space applications
      can happily ignore.  For the short term, better safe than sorry.)
      
      This proof assumes that neither compiler nor CPU will allow a lock
      acquisition and release to be reordered, as doing so can result in
      deadlock.  The proof is as follows:
      
      1.	A given CPU declares a quiescent state under the protection of
      	its leaf rcu_node's lock.
      
      2.	If there is more than one level of rcu_node hierarchy, the
      	last CPU to declare a quiescent state will also acquire the
      	->lock of the next rcu_node up in the hierarchy,  but only
      	after releasing the lower level's lock.  The acquisition of this
      	lock clearly cannot occur prior to the acquisition of the leaf
      	node's lock.
      
      3.	Step 2 repeats until we reach the root rcu_node structure.
      	Please note again that only one lock is held at a time through
      	this process.  The acquisition of the root rcu_node's ->lock
      	must occur after the release of that of the leaf rcu_node.
      
      4.	At this point, we set the ->completed field in the rcu_state
      	structure in rcu_report_qs_rsp().  However, if the rcu_node
      	hierarchy contains only one rcu_node, then in theory the code
      	preceding the quiescent state could leak into the critical
      	section.  We therefore precede the update of ->completed with a
      	memory barrier.  All CPUs will therefore agree that any updates
      	preceding any report of a quiescent state will have happened
      	before the update of ->completed.
      
      5.	Regardless of whether a new grace period is needed, rcu_start_gp()
      	will propagate the new value of ->completed to all of the leaf
      	rcu_node structures, under the protection of each rcu_node's ->lock.
      	If a new grace period is needed immediately, this propagation
      	will occur in the same critical section that ->completed was
      	set in, but courtesy of the memory barrier in #4 above, is still
      	seen to follow any pre-quiescent-state activity.
      
      6.	When a given CPU invokes __rcu_process_gp_end(), it becomes
      	aware of the end of the old grace period and therefore makes
      	any RCU callbacks that were waiting on that grace period eligible
      	for invocation.
      
      	If this CPU is the same one that detected the end of the grace
      	period, and if there is but a single rcu_node in the hierarchy,
      	we will still be in the single critical section.  In this case,
      	the memory barrier in step #4 guarantees that all callbacks will
      	be seen to execute after each CPU's quiescent state.
      
      	On the other hand, if this is a different CPU, it will acquire
      	the leaf rcu_node's ->lock, and will again be serialized after
      	each CPU's quiescent state for the old grace period.
      
      On the strength of this proof, this commit therefore removes the memory
      barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
      The effect is to reduce the number of memory barriers by one and to
      reduce the frequency of execution from about once per scheduling tick
      per CPU to once per grace period.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    • rcu: Remove conditional compilation for RCU CPU stall warnings · a00e0d71
      Paul E. McKenney authored
      
      The RCU CPU stall warnings can now be controlled using the
      rcu_cpu_stall_suppress boot-time parameter or via the same parameter
      from sysfs.  There is therefore no longer any reason to have
      kernel config parameters for this feature.  This commit therefore
      removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE
      kernel config parameters.  The RCU_CPU_STALL_TIMEOUT parameter remains
      to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter
      remains to allow task-stall information to be suppressed if desired.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
  3. Sep 23, 2010
    • rcu: Add tracing data to support queueing models · 269dcc1c
      Paul E. McKenney authored
      
      The current tracing data is not sufficient to deduce the average time
      that a callback spends waiting for a grace period to end.  Add three
      per-CPU counters recording the number of callbacks invoked (ci), the
      number of callbacks orphaned (co), and the number of callbacks adopted
      (ca).  Given the existing callback queue length (ql), the average wait
      time in absence of CPU hotplug operations is ql/ci.  The units of wait
      time will be in terms of the duration over which ci was measured.
      
      In the presence of CPU hotplug operations, there is room for argument,
      but ql/(ci-co+ca) won't steer you too far wrong.
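
      A worked example with made-up numbers: if a 10-second trace window
      shows ql=500 with ci=1000, the average wait is 500/1000 = 0.5
      measurement windows, or about 5 seconds.  In sketch form:

      #include <stdio.h>

      int main(void)
      {
      	double ql = 500.0;	/* Queued callbacks ("ql="). */
      	double ci = 1000.0;	/* Callbacks invoked this window. */
      	double co = 20.0;	/* Callbacks orphaned (CPU offlined). */
      	double ca = 30.0;	/* Callbacks adopted from offlined CPUs. */
      	double window = 10.0;	/* Window over which ci was measured, s. */

      	printf("avg wait: %.2f windows (%.1f s)\n",
      	       ql / ci, ql / ci * window);
      	printf("with hotplug: %.2f windows\n", ql / (ci - co + ca));
      	return 0;
      }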
      
      Also fixes a typo called out by Lucas De Marchi <lucas.de.marchi@gmail.com>.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  4. Aug 04, 2010
    • Documentation: update broken web addresses. · 0ea6e611
      Justin P. Mattock authored
      
      This is an updated version of the original series, bunching all patches
      into one big patch that updates broken web addresses located in
      Documentation/*.  Some of the addresses date as far back as 1995, so
      searching became a bit difficult; the best way to deal with these was to
      use web.archive.org to locate the outdated addresses.  There are also
      some addresses pointing to .spec files; some were located, but others
      (even after searching on the company's site) were nowhere to be found.
      In those cases the address was changed to the company site, so that users
      can contact the company to have the files located for them.
      
      Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
      Signed-off-by: Thomas Weber <weber@corscience.de>
      Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
      Cc: Paulo Marques <pmarques@grupopie.com>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
  5. May 06, 2010
    • sched: replace migration_thread with cpu_stop · 969c7921
      Tejun Heo authored
      
      Currently migration_thread is serving three purposes - migration
      pusher, context to execute active_load_balance() and forced context
      switcher for expedited RCU synchronize_sched.  All three roles are
      hardcoded into migration_thread() and determining which job is
      scheduled is slightly messy.
      
      This patch kills migration_thread and replaces all three uses with
      cpu_stop.  The three different roles of migration_thread() are
      split into three separate cpu_stop callbacks -
      migration_cpu_stop(), active_load_balance_cpu_stop() and
      synchronize_sched_expedited_cpu_stop() - and each use case now simply
      asks cpu_stop to execute the callback as necessary.
      
      synchronize_sched_expedited() had been implemented with private
      preallocated resources and custom multi-cpu queueing and waiting
      logic, both of which are now provided by cpu_stop.
      synchronize_sched_expedited_count is made atomic and all other shared
      resources along with the mutex are dropped.
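
      A sketch of the resulting pattern for the expedited-RCU role
      (stop_cpus() and the cpu_stop callback signature are the real API;
      the example_* names are hypothetical):

      #include <linux/stop_machine.h>

      /* Runs in the per-CPU stopper context; merely running there forces
       * a context switch on that CPU, which is the quiescent state an
       * expedited sched-RCU grace period needs from it. */
      static int example_expedited_cpu_stop(void *unused)
      {
      	smp_mb();	/* Order prior accesses against the grace period. */
      	return 0;
      }

      static void example_synchronize_sched_expedited(void)
      {
      	/* cpu_stop queues the callback on every online CPU and waits
      	 * for completion, replacing the hand-rolled logic. */
      	stop_cpus(cpu_online_mask, example_expedited_cpu_stop, NULL);
      }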
      
      synchronize_sched_expedited() also implemented a check to detect
      cases where not all the callbacks got executed on their assigned
      CPUs, falling back to synchronize_sched().  If called with cpu
      hotplug blocked, cpu_stop already guarantees that all callbacks run,
      so the condition cannot happen; otherwise, stop_machine() would
      break.  However, this patch preserves the paranoid check, using a
      cpumask to record on which CPUs the stopper ran, so that it can
      serve as a bisection point if something actually goes wrong there.
      
      Because the internal execution state is no longer visible,
      rcu_expedited_torture_stats() is removed.
      
      This patch also renames the cpu_stop threads from "stopper/%d" to
      "migration/%d".  The names of these threads ultimately don't matter
      and there's no reason to make unnecessary userland visible changes.
      
      With this patch applied, stop_machine() and sched now share the same
      resources.  stop_machine() is faster without wasting any resources and
      sched migration users are much cleaner.
      
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Josh Triplett <josh@freedesktop.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
  6. Jan 16, 2010
    • rcu: 1Q2010 update for RCU documentation · 4c54005c
      Paul E. McKenney authored
      
      Add expedited functions.  Review documentation and update
      obsolete verbiage.  Also fix the advice for the RCU CPU-stall
      kernel configuration parameter, and document RCU CPU-stall
      warnings.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      LKML-Reference: <12635142581866-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. Oct 15, 2009
    • rcu: Update trace.txt documentation for blocked-tasks lists · 0edf1a68
      Paul E. McKenney authored
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      Cc: npiggin@suse.de
      Cc: jens.axboe@oracle.com
      LKML-Reference: <12555405592804-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Update trace.txt documentation to reflect recent changes · bd58b430
      Paul E. McKenney authored
      
      o	Remove the CONFIG_PREEMPT_RCU documentation since this
      	config option has now been removed.
      
      o	Change the now-incorrect references to "rcu" labels to
      	instead be "rcu_sched".
      
      o	Add notes stating that CONFIG_TREE_PREEMPT_RCU kernels will
      	have additional "rcu_preempt" output.
      
      o	Note the new "oqlen" field in the rcuhier output (for
      	RCU callbacks orphaned by an offlined CPU).
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josh@joshtriplett.org
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      Cc: Valdis.Kletnieks@vt.edu
      Cc: dhowells@redhat.com
      Cc: npiggin@suse.de
      Cc: jens.axboe@oracle.com
      LKML-Reference: <1255540559799-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. Aug 23, 2009
    • rcu: Remove CONFIG_PREEMPT_RCU · 6b3ef48a
      Paul E. McKenney authored
      
      Now that CONFIG_TREE_PREEMPT_RCU is in place, there is no
      further need for CONFIG_PREEMPT_RCU.  Remove it, along with
      whatever subtle bugs it may (or may not) contain.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <125097461396-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • rcu: Renamings to increase RCU clarity · d6714c22
      Paul E. McKenney authored
      
      Make RCU-sched, RCU-bh, and RCU-preempt be underlying
      implementations, with "RCU" defined in terms of one of the
      three.  Update the outdated rcu_qsctr_inc() names, as these
      functions no longer increment anything.
      
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: laijs@cn.fujitsu.com
      Cc: dipankar@in.ibm.com
      Cc: akpm@linux-foundation.org
      Cc: mathieu.desnoyers@polymtl.ca
      Cc: josht@linux.vnet.ibm.com
      Cc: dvhltc@us.ibm.com
      Cc: niv@us.ibm.com
      Cc: peterz@infradead.org
      Cc: rostedt@goodmis.org
      LKML-Reference: <12509746132696-git-send-email->
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. Jul 16, 2009
    • netfilter: nf_conntrack: nf_conntrack_alloc() fixes · 941297f4
      Eric Dumazet authored
      
      When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when
      allocating objects, since the slab allocator could return a freed object
      that is still being used by lockless readers.
      
      In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next
      always being valid (i.e., containing a valid 'nulls' value, or a valid
      pointer to the next object in the hash chain).
      
      kmem_cache_zalloc() sets up the object with zeroed values, but NULL is not
      a valid value for ct->tuplehash[xxx].hnnode.next.
      
      The fix is to call kmem_cache_alloc() and do the zeroing ourselves.
      
      As spotted by Patrick, we also need to make sure the lookup keys are
      committed to memory before setting the refcount to 1, or a lockless reader
      could get a reference on the old version of the object, and its key
      re-check could then falsely pass.
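
      A condensed sketch of the pattern (struct and function names are
      hypothetical; the real fix operates on struct nf_conn): zero only the
      fields lockless readers never traverse, then commit the lookup key
      with a write barrier before publishing the refcount:

      #include <linux/slab.h>
      #include <linux/list_nulls.h>

      struct example_obj {
      	struct hlist_nulls_node hnnode;	/* Never zeroed: readers may
      					 * still traverse it after a
      					 * free under SLAB_DESTROY_BY_RCU. */
      	unsigned long key;		/* Lookup key. */
      	atomic_t use;			/* Refcount; 0 while unpublished. */
      };

      static struct example_obj *example_alloc(struct kmem_cache *cachep,
      					 unsigned long key, gfp_t gfp)
      {
      	struct example_obj *obj = kmem_cache_alloc(cachep, gfp);

      	if (!obj)
      		return NULL;
      	obj->key = key;		/* Set the key; leave hnnode alone. */
      	smp_wmb();		/* Commit key before readers can take a ref. */
      	atomic_set(&obj->use, 1);
      	return obj;
      }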
      
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Patrick McHardy <kaber@trash.net>