1. 24 Sep, 2014 9 commits
    • percpu_ref: decouple switching to percpu mode and reinit · f47ad457
      Tejun Heo authored
      percpu_ref has treated the dropping of the base reference and
      switching to atomic mode as an integral operation; however, there's
      nothing inherent tying the two together.
      
      The use cases for percpu_ref have been expanding continuously.  While
      the current init/kill/reinit/exit model can cover a lot, the coupling
      of kill/reinit with atomic/percpu mode switching is turning out to be
      too restrictive for use cases where many percpu_refs are created and
      destroyed back-to-back with only some of them reaching extended
      operation.  The coupling also makes implementing always-atomic debug
      mode difficult.
      
      This patch separates out percpu mode switching into
      percpu_ref_switch_to_percpu() and reimplements percpu_ref_reinit() on
      top of it.
      
      * DEAD still requires ATOMIC.  A dead ref can't be switched to percpu
        mode w/o going through reinit.
      
      v2: __percpu_ref_switch_to_percpu() was missing static.  Fixed.
          Reported by Fengguang aka kbuild test robot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: kbuild test robot <fengguang.wu@intel.com>
    • percpu_ref: decouple switching to atomic mode and killing · 490c79a6
      Tejun Heo authored
      percpu_ref has treated the dropping of the base reference and
      switching to atomic mode as an integral operation; however, there's
      nothing inherent tying the two together.
      
      The use cases for percpu_ref have been expanding continuously.  While
      the current init/kill/reinit/exit model can cover a lot, the coupling
      of kill/reinit with atomic/percpu mode switching is turning out to be
      too restrictive for use cases where many percpu_refs are created and
      destroyed back-to-back with only some of them reaching extended
      operation.  The coupling also makes implementing always-atomic debug
      mode difficult.
      
      This patch separates out atomic mode switching into
      percpu_ref_switch_to_atomic() and reimplements
      percpu_ref_kill_and_confirm() on top of it.
      
      * The handling of __PERCPU_REF_ATOMIC and __PERCPU_REF_DEAD is now
        differentiated.  Among get/put operations, percpu_ref_tryget_live()
        is the only one which cares about DEAD.
      
      * percpu_ref_switch_to_atomic() can be called multiple times on the
        same ref.  This means that multiple @confirm_switch may get queued
        up which we can't do reliably without extra memory area.  This is
        handled by making the later invocation synchronously wait for the
        completion of the previous one.  This isn't particularly desirable
        but such synchronous waits shouldn't happen in most cases.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
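To illustrate how ATOMIC and DEAD can be tracked independently, here is a minimal single-threaded userspace sketch. Every `model_*` name is hypothetical and is not kernel API, the flag word stands in for the low bits of the percpu pointer, and the RCU-based switching and the dropping of the base reference are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the kernel keeps its flags in the low bits of
 * the percpu counter pointer, which this model reduces to a plain word. */
enum {
	MODEL_REF_ATOMIC = 1 << 0,	/* counting uses the shared counter */
	MODEL_REF_DEAD   = 1 << 1,	/* the ref has been killed */
};

struct model_ref {
	unsigned long flags;
	long count;		/* stand-in for the shared atomic counter */
};

/* Plain tryget: ignores DEAD and only refuses once the count is zero. */
static bool model_tryget(struct model_ref *ref)
{
	if (ref->count == 0)
		return false;
	ref->count++;
	return true;
}

/* tryget_live: the only get/put operation in this model that tests DEAD. */
static bool model_tryget_live(struct model_ref *ref)
{
	if (ref->flags & MODEL_REF_DEAD)
		return false;
	return model_tryget(ref);
}

/* Switching to atomic mode no longer implies killing the ref. */
static void model_switch_to_atomic(struct model_ref *ref)
{
	ref->flags |= MODEL_REF_ATOMIC;
}

/* Killing is built on top of the mode switch: DEAD implies ATOMIC. */
static void model_kill(struct model_ref *ref)
{
	model_switch_to_atomic(ref);
	ref->flags |= MODEL_REF_DEAD;
}
```

In this sketch a ref that is merely in atomic mode still hands out live references, while a killed ref refuses `model_tryget_live()` but still accepts a plain `model_tryget()`.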
    • percpu_ref: add PCPU_REF_DEAD · 27344a90
      Tejun Heo authored
      percpu_ref will be restructured so that percpu/atomic mode switching
      and reference killing are decoupled.  In preparation, add
      PCPU_REF_DEAD and PCPU_REF_ATOMIC_DEAD which is OR of ATOMIC and DEAD.
      For now, ATOMIC and DEAD are changed together and all PCPU_REF_ATOMIC
      uses are converted to PCPU_REF_ATOMIC_DEAD without causing any
      behavior changes.
      
      percpu_ref_init() now specifies an explicit alignment when allocating
      the percpu counters so that the pointer has enough unused low bits to
      accommodate the flags.  Note that one flag was fine as min alignment
      for percpu memory is 2 bytes but two flags are already too many for
      the natural alignment of unsigned longs on archs like cris and m68k.
      
      v2: The original patch had BUILD_BUG_ON() which triggers if unsigned
          long's alignment isn't enough to accommodate the flags, which
          triggered on cris and m68k.  percpu_ref_init() updated to specify
          the required alignment explicitly.  Reported by Fengguang.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
      Cc: kbuild test robot <fengguang.wu@intel.com>
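The alignment requirement can be sketched in userspace as follows. This is an illustrative model only: `aligned_alloc()` stands in for the percpu allocator, and the `MODEL_*` macros and `model_*` helpers are hypothetical names. The point is that two flag bits need the pointer to be at least 4-byte aligned so that its two lowest bits are guaranteed to be zero.

```c
#include <stdint.h>
#include <stdlib.h>

#define MODEL_FLAG_BITS 2
#define MODEL_FLAG_MASK ((1UL << MODEL_FLAG_BITS) - 1)

/* Allocate counters whose address has MODEL_FLAG_BITS zero low bits. */
static unsigned long *model_alloc_counters(size_t nr)
{
	size_t align = (size_t)1 << MODEL_FLAG_BITS;
	size_t size = nr * sizeof(unsigned long);

	/* Never ask for less than the natural alignment of the type. */
	if (align < _Alignof(unsigned long))
		align = _Alignof(unsigned long);
	/* aligned_alloc() expects the size to be a multiple of the alignment. */
	size = (size + align - 1) / align * align;
	return aligned_alloc(align, size);
}

/* Store the flags in the unused low bits of the pointer value. */
static uintptr_t model_pack(unsigned long *counters, unsigned long flags)
{
	return (uintptr_t)counters | (flags & MODEL_FLAG_MASK);
}

/* Mask the flags out again to recover the usable pointer. */
static unsigned long *model_unpack_ptr(uintptr_t word)
{
	return (unsigned long *)(word & ~(uintptr_t)MODEL_FLAG_MASK);
}

static unsigned long model_unpack_flags(uintptr_t word)
{
	return word & MODEL_FLAG_MASK;
}
```

With only natural alignment, an architecture that aligns `unsigned long` to 2 bytes would leave a single free low bit, which is why the alignment is requested explicitly rather than assumed.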
    • percpu_ref: rename things to prepare for decoupling percpu/atomic mode switch · 9e804d1f
      Tejun Heo authored
      percpu_ref will be restructured so that percpu/atomic mode switching
      and reference killing are decoupled.  In preparation, do the following
      renames.
      
      * percpu_ref->confirm_kill	-> percpu_ref->confirm_switch
      * __PERCPU_REF_DEAD		-> __PERCPU_REF_ATOMIC
      * __percpu_ref_alive()		-> __ref_is_percpu()
      
      This patch is pure rename and doesn't introduce any functional
      changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    • percpu_ref: replace pcpu_ prefix with percpu_ · eecc16ba
      Tejun Heo authored
      percpu_ref uses pcpu_ prefix for internal stuff and percpu_ for
      externally visible ones.  This is the same convention used in the
      percpu allocator implementation.  It works fine there but percpu_ref
      doesn't have too much internal-only stuff and scattered usages of
      pcpu_ prefix are more confusing than helpful.
      
      This patch replaces all pcpu_ prefixes with percpu_.  This is pure
      rename and there's no functional change.  Note that PCPU_REF_DEAD is
      renamed to __PERCPU_REF_DEAD to signify that the flag is internal.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    • percpu_ref: minor code and comment updates · 6251f997
      Tejun Heo authored
      * Some comments became stale.  Updated.
      * percpu_ref_tryget() unnecessarily initializes @ret.  Removed.
      * A blank line removed from percpu_ref_kill_rcu().
      * Explicit function name in a WARN format string replaced with __func__.
      * WARN_ON() in percpu_ref_reinit() converted to WARN_ON_ONCE().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    • percpu_ref: relocate percpu_ref_reinit() · a2237370
      Tejun Heo authored
      percpu_ref is gonna go through restructuring.  Move
      percpu_ref_reinit() after percpu_ref_kill_and_confirm().  This will
      make later changes easier to follow and result in cleaner
      organization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Kent Overstreet <kmo@daterainc.com>
    • Revert "blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe" · 9eca8046
      Tejun Heo authored
      This reverts commit 0a30288d, which
      was a temporary fix for SCSI blk-mq stall issue.  The following
      patches will fix the issue properly by introducing atomic mode to
      percpu_ref.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
    • blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe · 0a30288d
      Tejun Heo authored
      blk-mq uses percpu_ref for its usage counter which tracks the number
      of in-flight commands and is used to synchronously drain the queue on
      freeze.  percpu_ref shutdown takes measurable wallclock time as it
      involves a sched RCU grace period.  This means that draining a blk-mq
      takes measurable wallclock time.  One would think that this shouldn't
      matter as queue shutdown should be a rare event which takes place
      asynchronously w.r.t. userland.
      
      Unfortunately, SCSI probing involves synchronously setting up and then
      tearing down a lot of request_queues back-to-back for non-existent
      LUNs.  This means that SCSI probing may take more than ten seconds
      when scsi-mq is used.
      
      This will be properly fixed by implementing a mechanism to keep
      q->mq_usage_counter in atomic mode till genhd registration; however,
      that involves rather big updates to percpu_ref which is difficult to
      apply late in the devel cycle (v3.17-rc6 at the moment).  As a
      stop-gap measure till the proper fix can be implemented in the next
      cycle, this patch introduces __percpu_ref_kill_expedited() and makes
      blk_mq_freeze_queue() use it.  This is heavy-handed but should work
      for testing the experimental SCSI blk-mq implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Christoph Hellwig <hch@infradead.org>
      Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
      Fixes: add703fd ("blk-mq: use percpu_ref for mq usage count")
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Tested-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  2. 19 Sep, 2014 1 commit
    • percpu-refcount: make percpu_ref based on longs instead of ints · e625305b
      Tejun Heo authored
      percpu_ref is currently based on ints and the number of refs it can
      cover is (1 << 31).  This makes it impossible to use a percpu_ref to
      count memory objects or pages on 64bit machines as it may overflow.
      This forces those users to somehow aggregate the references before
      contributing to the percpu_ref which is often cumbersome and sometimes
      challenging to get the same level of performance as using the
      percpu_ref directly.
      
      While using ints for the percpu counters makes them pack tighter on
      64bit machines, the possible gain from using ints instead of longs is
      extremely small compared to the overall gain from per-cpu operation.
      This patch makes percpu_ref based on longs so that it can be used to
      directly count memory objects or pages.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
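As a rough worked example of the overflow concern, the helper below computes how much memory can be covered when each page holds one reference and the counter tops out at 2^bits references. The helper name is hypothetical, and the 4 KiB page size used in the note is an assumption, not something stated in the commit.

```c
#include <stdint.h>

/* Bytes of memory whose pages can each hold one reference before a
 * counter limited to 2^bits references runs out (bits must be < 64). */
static uint64_t model_countable_bytes(unsigned int bits, uint64_t page_size)
{
	return (UINT64_C(1) << bits) * page_size;
}
```

Under that assumption, `model_countable_bytes(31, 4096)` is 2^43 bytes, that is 8 TiB, which is within reach of large 64-bit machines and far below what a long-based counter can cover.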
  3. 07 Sep, 2014 1 commit
    • percpu-refcount: add @gfp to percpu_ref_init() · a34375ef
      Tejun Heo authored
      Percpu allocator now supports allocation mask.  Add @gfp to
      percpu_ref_init() so that !GFP_KERNEL allocation masks can be used
      with percpu_refs too.
      
      This patch doesn't make any functional difference.
      
      v2: blk-mq conversion was missing.  Updated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Jens Axboe <axboe@kernel.dk>
  4. 28 Jun, 2014 5 commits
    • percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() · 2d722782
      Tejun Heo authored
      Now that explicit invocation of percpu_ref_exit() is necessary to free
      the percpu counter, we can implement percpu_ref_reinit() which
      reinitializes a released percpu_ref.  This can be used to implement a
      scalable gating switch which can be drained and then re-opened without
      worrying about memory allocation failures.
      
      percpu_ref_is_zero() is added to be used in a sanity check in
      percpu_ref_exit().  As this function will be useful for other purposes
      too, make it a public interface.
      
      v2: Use smp_read_barrier_depends() instead of smp_load_acquire().  We
          only need data dep barrier and smp_load_acquire() is stronger and
          heavier on some archs.  Spotted by Lai Jiangshan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
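The gating-switch idea can be sketched with a small userspace model in which allocation happens only at init time, so that re-opening a drained gate cannot fail. All `model_gate_*` names are hypothetical; the real percpu counters, RCU and the release callback are omitted.

```c
#include <stdbool.h>
#include <stdlib.h>

struct model_gate {
	long *counters;	/* allocated once by init, freed only by exit */
	long count;	/* stand-in for the reference count */
	bool dead;
};

/* The only step that allocates, and therefore the only one that can fail. */
static int model_gate_init(struct model_gate *g, size_t nr_cpus)
{
	g->counters = calloc(nr_cpus, sizeof(*g->counters));
	if (!g->counters)
		return -1;
	g->count = 1;		/* base reference */
	g->dead = false;
	return 0;
}

/* Close the gate and drop the base reference. */
static void model_gate_kill(struct model_gate *g)
{
	g->dead = true;
	g->count--;
}

/* True once the gate is closed and every reference has been put. */
static bool model_gate_is_zero(const struct model_gate *g)
{
	return g->dead && g->count == 0;
}

/* Re-open a drained gate, reusing the counters that init allocated. */
static void model_gate_reinit(struct model_gate *g)
{
	g->count = 1;
	g->dead = false;
}

/* Explicit counterpart of init: this is where the memory is released. */
static void model_gate_exit(struct model_gate *g)
{
	free(g->counters);
	g->counters = NULL;
}
```

Because the counters survive the kill, the kill/reinit cycle can repeat any number of times between a single init and a single exit.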
    • percpu-refcount: require percpu_ref to be exited explicitly · 9a1049da
      Tejun Heo authored
      Currently, a percpu_ref undoes percpu_ref_init() automatically by
      freeing the allocated percpu area when the percpu_ref is killed.
      While seemingly convenient, this has the following niggles.
      
      * It's impossible to re-init a released reference counter without
        going through re-allocation.
      
      * In a similar vein, it's impossible to initialize a percpu_ref
        count with static percpu variables.
      
      * We need and have an explicit destructor anyway for failure paths -
        percpu_ref_cancel_init().
      
      This patch removes the automatic percpu counter freeing in
      percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a
      generic destructor now named percpu_ref_exit().  percpu_ref_destroy()
      was considered but it gets confusing with percpu_ref_kill() while
      "exit" clearly indicates that it's the counterpart of
      percpu_ref_init().
      
      All percpu_ref_cancel_init() users are updated to invoke
      percpu_ref_exit() instead and explicit percpu_ref_exit() calls are
      added to the destruction path of all percpu_ref users.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Li Zefan <lizefan@huawei.com>
    • percpu-refcount: use unsigned long for pcpu_count pointer · 7d742075
      Tejun Heo authored
      percpu_ref->pcpu_count is a percpu pointer with a status flag in its
      lowest bit.  As such, it always goes through arithmetic operations
      which are very cumbersome to do on a pointer.  It first has to be
      cast to unsigned long and then back.
      
      Let's just make the field unsigned long so that we can skip the first
      casts.  While at it, rename it to pcpu_counter_ptr to clarify that
      it's a pointer value.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
    • percpu-refcount: add helpers for ->percpu_count accesses · eae7975d
      Tejun Heo authored
      * All four percpu_ref_*() operations implemented in the header file
        perform the same operation to determine whether the percpu_ref is
        alive and extract the percpu pointer.  Factor out the common logic
        into __pcpu_ref_alive().  This doesn't change the generated code.
      
      * There are a couple of places in percpu-refcount.c which mask out
        PCPU_REF_DEAD to obtain the percpu pointer.  Factor it out into
        pcpu_count_ptr().
      
      * The above changes make the WARN_ON_ONCE() conditional at the top of
        percpu_ref_kill_and_confirm() the only user of REF_STATUS().  Test
        PCPU_REF_DEAD directly and remove REF_STATUS().
      
      This patch doesn't introduce any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
    • percpu-refcount: one bit is enough for REF_STATUS · d630dc4c
      Tejun Heo authored
      percpu-refcount currently reserves two lowest bits of its percpu
      pointer to indicate its state; however, only one bit is used for
      PCPU_REF_DEAD.
      
      Simplify it by removing PCPU_STATUS_BITS/MASK and testing
      PCPU_REF_DEAD directly.  This also allows the compiler to choose a
      more efficient instruction depending on the architecture.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
  5. 04 Jun, 2014 1 commit
    • percpu-refcount: fix usage of this_cpu_ops · 0c36b390
      Sebastian Ott authored
      The percpu-refcount infrastructure uses the underscore variants of
      this_cpu_ops in order to modify percpu reference counters.
      (e.g. __this_cpu_inc()).
      
      However the underscore variants do not atomically update the percpu
      variable, instead they may be implemented using read-modify-write
      semantics (more than one instruction).  Therefore it is only safe to
      use the underscore variant if the context is always the same (process,
      softirq, or hardirq). Otherwise it is possible to lose updates.
      
      This problem is something that Sebastian has seen within the aio
      subsystem which uses percpu refcounters both in process and softirq
      context, leading to reference counts that never dropped to zero even
      though the number of "get" and "put" calls matched.
      
      Fix this by using the non-underscore this_cpu_ops variant which
      provides correct per cpu atomic semantics and fixes the corrupted
      reference counts.
      
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: <stable@vger.kernel.org> # v3.11+
      Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      References: http://lkml.kernel.org/g/alpine.LFD.2.11.1406041540520.21183@denkbrett
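The lost update can be reproduced deterministically in a single-threaded userspace model, where the "interrupt" is simply a callback invoked between the load and the store of a read-modify-write sequence. All names below are hypothetical, and the model says nothing about how any particular architecture implements the this_cpu operations.

```c
/* Hypothetical model of one CPU's reference counter. */
static long model_counter;

/* "Softirq" handler that drops a reference on the same CPU. */
static void model_softirq_put(void)
{
	model_counter--;
}

/* Non-atomic increment: load, get interrupted, then store a stale value. */
static void model_racy_get(void (*irq)(void))
{
	long tmp = model_counter;	/* load */

	if (irq)
		irq();			/* interrupt lands between load and store */
	model_counter = tmp + 1;	/* store overwrites the handler's update */
}

/* Interrupt-safe increment: the update is modelled as one indivisible
 * step, so the handler can only run before or after it. */
static void model_safe_get(void (*irq)(void))
{
	model_counter++;
	if (irq)
		irq();
}
```

Starting from 1, one get plus one put should leave the counter at 1. The racy variant ends at 2 because the put is overwritten, which matches the symptom of a count that never reaches zero despite balanced calls.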
  6. 09 May, 2014 2 commits
  7. 16 Jun, 2013 1 commit
    • percpu-refcount: use RCU-sched instead of normal RCU · a4244454
      Tejun Heo authored
      percpu-refcount was incorrectly using preempt_disable/enable() for RCU
      critical sections against call_rcu().  6a24474d ("percpu-refcount:
      consistently use plain (non-sched) RCU") fixed it by converting the
      preemption operations to rcu_read_[un]lock(), citing that there isn't
      any advantage in using sched-RCU over using the usual one; however,
      rcu_read_[un]lock() for the preemptible RCU implementation -
      CONFIG_TREE_PREEMPT_RCU, chosen when CONFIG_PREEMPT is set - is slightly
      more expensive than preempt_disable/enable().
      
      In a contrived microbench which repeats the following,
      
       - percpu_ref_get()
       - copy 32 bytes of data into percpu buffer
       - percpu_ref_put()
       - copy 32 bytes of data into percpu buffer
      
      rcu_read_[un]lock() used in percpu_ref_get/put() makes it go slower by
      about 15% when compared to using sched-RCU.
      
      As the RCU critical sections are extremely short, using sched-RCU
      shouldn't have any latency implications.  Convert to RCU-sched.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Kent Overstreet <koverstreet@google.com>
      Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
  8. 13 Jun, 2013 3 commits
    • percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm() · dbece3a0
      Tejun Heo authored
      Implement percpu_tryget() which stops giving out references once the
      percpu_ref is visible as killed.  Because the refcnt is per-cpu,
      different CPUs will start to see a refcnt as killed at different
      points in time and tryget() may continue to succeed on a subset of CPUs
      for a while after percpu_ref_kill() returns.
      
      For use cases where it's necessary to know when all CPUs start to see
      the refcnt as dead, percpu_ref_kill_and_confirm() is added.  The new
      function takes an extra argument @confirm_kill which is invoked when
      the refcnt is guaranteed to be viewed as killed on all CPUs.
      
      While this isn't the prettiest interface, it doesn't force synchronous
      wait and is much safer than requiring the caller to do its own
      call_rcu().
      
      v2: Patch description rephrased to emphasize that tryget() may
          continue to succeed on some CPUs after kill() returns as suggested
          by Kent.
      
      v3: Function comment in percpu_ref_kill_and_confirm() updated warning
          people to not depend on the implied RCU grace period from the
          confirm callback as it's an implementation detail.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Slightly-Grumpily-Acked-by: Kent Overstreet <koverstreet@google.com>
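The staggered visibility described in this commit can be modelled in userspace by giving each CPU its own "has seen the kill" flag and firing the confirmation callback once the last CPU has observed it. This is a sketch under heavy assumptions: the `model_*` names are hypothetical, and `model_cpu_observe()` merely stands in for whatever makes the kill visible on a CPU in the real implementation.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 2

struct model_kref {
	bool killed;
	bool seen_dead[MODEL_NR_CPUS];	/* per-CPU view of the killed state */
	int confirmed;			/* times the confirm callback ran */
	void (*confirm_kill)(struct model_kref *ref);
};

/* Example confirmation callback: just records that it was invoked. */
static void model_note_confirm(struct model_kref *ref)
{
	ref->confirmed++;
}

/* Returns immediately; the callback fires later, not from here. */
static void model_kill_and_confirm(struct model_kref *ref,
				   void (*confirm)(struct model_kref *))
{
	ref->killed = true;
	ref->confirm_kill = confirm;
}

/* tryget keeps succeeding on a CPU until that CPU has seen the kill. */
static bool model_tryget(const struct model_kref *ref, int cpu)
{
	return !ref->seen_dead[cpu];
}

/* One CPU picks up the killed state; the last one to do so confirms. */
static void model_cpu_observe(struct model_kref *ref, int cpu)
{
	if (!ref->killed || ref->seen_dead[cpu])
		return;
	ref->seen_dead[cpu] = true;
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if (!ref->seen_dead[i])
			return;
	if (ref->confirm_kill)
		ref->confirm_kill(ref);
}
```

In this model `model_tryget()` can still succeed on CPU 1 after CPU 0 already refuses, and the callback runs exactly once, after both CPUs refuse.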
    • percpu-refcount: implement percpu_ref_cancel_init() · bc497bd3
      Tejun Heo authored
      Normally, percpu_ref_init() initializes and percpu_ref_kill()
      initiates destruction which completes asynchronously.  The
      asynchronous destruction can be problematic in the init failure path where
      the caller wants to destroy a half-constructed object - distinguishing
      half-constructed objects from the usual release method can be painful
      for complex objects.
      
      This patch implements percpu_ref_cancel_init() which synchronously
      destroys the percpu_ref without invoking release.  To avoid
      unintentional misuses, the function requires the ref to have finished
      percpu_ref_init() but never used and triggers WARN otherwise.
      
      v2: Explain the weird name and usage restriction in the function
          comment.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Kent Overstreet <koverstreet@google.com>
    • percpu-refcount: add __must_check to percpu_ref_init() and don't use... · acac7883
      Tejun Heo authored
      percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu()
      
      Two small changes.
      
      * Unlike most init functions, percpu_ref_init() allocates memory and
        may fail.  Let's mark it with __must_check in case the caller
        forgets.
      
      * percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to
        dereference @ref->pcpu_count, which can be misleading.  The pointer
        is guaranteed to be valid and visible and can't change underneath
        the function.  Drop ACCESS_ONCE().
      Signed-off-by: Tejun Heo <tj@kernel.org>
  9. 12 Jun, 2013 2 commits
  10. 03 Jun, 2013 1 commit
    • percpu: implement generic percpu refcounting · 215e262f
      Kent Overstreet authored
      This implements a refcount with similar semantics to
      atomic_get()/atomic_dec_and_test() - but percpu.
      
      It also implements two stage shutdown, as we need it to tear down the
      percpu counts.  Before dropping the initial refcount, you must call
      percpu_ref_kill(); this puts the refcount in "shutting down mode" and
      switches back to a single atomic refcount with the appropriate
      barriers (synchronize_rcu()).
      
      It's also legal to call percpu_ref_kill() multiple times - it only
      returns true once, so callers don't have to reimplement shutdown
      synchronization.
      
      [akpm@linux-foundation.org: fix build]
      [akpm@linux-foundation.org: coding-style tweak]
      Signed-off-by: Kent Overstreet <koverstreet@google.com>
      Cc: Zach Brown <zab@redhat.com>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Tejun Heo <tj@kernel.org>
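The two-stage shutdown can be illustrated with a single-threaded userspace model: while the ref is alive, gets and puts touch only a per-CPU slot, and the kill folds those slots into one shared counter before the initial reference is dropped. All names are hypothetical, and the barriers and grace period that the real code needs are replaced here by the fact that nothing runs concurrently.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_percpu_ref {
	long pcpu[MODEL_NR_CPUS];	/* per-CPU deltas, used while alive */
	long count;			/* shared counter, used after kill */
	bool dying;
	bool released;			/* set when the last reference is put */
};

static void model_ref_init(struct model_percpu_ref *ref)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		ref->pcpu[i] = 0;
	ref->count = 1;			/* the initial reference */
	ref->dying = false;
	ref->released = false;
}

static void model_ref_get(struct model_percpu_ref *ref, int cpu)
{
	if (!ref->dying)
		ref->pcpu[cpu]++;	/* fast path: no shared cache line */
	else
		ref->count++;
}

static void model_ref_put(struct model_percpu_ref *ref, int cpu)
{
	if (!ref->dying) {
		ref->pcpu[cpu]--;	/* may go negative on this CPU */
		return;
	}
	if (--ref->count == 0)
		ref->released = true;
}

/* Stage one of shutdown: fold the per-CPU deltas into the shared count. */
static void model_ref_kill(struct model_percpu_ref *ref)
{
	ref->dying = true;
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		ref->count += ref->pcpu[i];
		ref->pcpu[i] = 0;
	}
}
```

A get on one CPU may be balanced by a put on another, so individual slots are meaningless on their own; only the sum taken at kill time is. That is also why zero cannot be detected until the ref has been switched to the shared counter.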